Need help with algorithm

MJ
Hello, I need some suggestions on how to do this.

I am reading through a large IIS log. For each line, I get three data points
(sessionID, Time, Stage). I need to store these values in some collection,
but I'm not sure which one. Here is how they are related.

For each sessionID, there is one or more Time points. For each Time point
related to the sessionID, there is one Stage.

So we could have something like this:

Sid, Time, Stage
123, 12:00, search
543, 12:00, addToCart
123, 12:01, addToWishlist
657, 12:03, fillForm
123, 12:06, logIn
543, 12:06, logOut

In the end, I need this:

Sid: 123
search->addToWishlist->logIn

Sid: 543
addToCart->logOut

Sid: 657
fillForm

What I want to do is use the Time to sort the stages in the order that the user
performed the steps. I'm up to the point where I'm looping through each line
of the log and using regular expressions to get the 3 data points. Now
how should I store them? Something tells me a hashTable is the way to go,
but I don't know how to implement it using 3 data points. I think Sid+Time
should be the unique key with the stage being the value for the hash.

So far I've tried creating two hashTables (X, Y). X has Time as the key and
Stage as the value. Y has SessionID as the key and X as the value, but this
isn't working because Time is not unique, since other users are doing things
at the same time.

Any suggestions?
Nov 29 '06 #1
MJ wrote:
> For each sessionID, there is one or more Time points. For each Time point
> related to the sessionID, there is one Stage.

> Sid: 123
> search->addToWishlist->logIn
>
> Sid: 543
> addToCart->logOut
>
> Sid: 657
> fillForm

This suggests that what you want is a Hashtable (or Dictionary), keyed
by session ID, and holding ArrayLists (or List<>s) of Time/Stage
pairs.

Dictionary<SessionID, List<TimeStagePair>>
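
A minimal sketch of that shape, for illustration only. It assumes the session ID is kept as a string and the time is parsed into a TimeSpan; the TimeStagePair struct, the SessionPaths class, and the method names Group and Describe are hypothetical and not from the thread. It also uses newer C# syntax (var, lambdas) than .NET 2.0 offered.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical pair type; the thread only gives the name TimeStagePair.
public struct TimeStagePair
{
    public readonly TimeSpan Time;
    public readonly string Stage;

    public TimeStagePair(TimeSpan time, string stage)
    {
        Time = time;
        Stage = stage;
    }
}

public static class SessionPaths
{
    // Groups rows of { sid, time, stage } by session ID.
    public static Dictionary<string, List<TimeStagePair>> Group(IEnumerable<string[]> rows)
    {
        var bySession = new Dictionary<string, List<TimeStagePair>>();
        foreach (string[] row in rows)
        {
            List<TimeStagePair> steps;
            if (!bySession.TryGetValue(row[0], out steps))
            {
                steps = new List<TimeStagePair>();
                bySession.Add(row[0], steps);
            }
            // TimeSpan.Parse accepts both "hh:mm" and "hh:mm:ss".
            steps.Add(new TimeStagePair(TimeSpan.Parse(row[1]), row[2]));
        }
        return bySession;
    }

    // Sorts one session's steps by time (in place) and joins the stages with "->".
    // List<T>.Sort is not stable, so steps sharing a timestamp may come out in either order.
    public static string Describe(List<TimeStagePair> steps)
    {
        steps.Sort((a, b) => a.Time.CompareTo(b.Time));
        var stages = new List<string>();
        foreach (TimeStagePair step in steps)
        {
            stages.Add(step.Stage);
        }
        return string.Join("->", stages);
    }
}

public static class Program
{
    public static void Main()
    {
        // Sample rows from the question, standing in for the regex output.
        var rows = new List<string[]>
        {
            new[] { "123", "12:00", "search" },
            new[] { "543", "12:00", "addToCart" },
            new[] { "123", "12:01", "addToWishlist" },
            new[] { "657", "12:03", "fillForm" },
            new[] { "123", "12:06", "logIn" },
            new[] { "543", "12:06", "logOut" }
        };
        foreach (KeyValuePair<string, List<TimeStagePair>> entry in SessionPaths.Group(rows))
        {
            Console.WriteLine("Sid: " + entry.Key);
            Console.WriteLine(SessionPaths.Describe(entry.Value));
        }
    }
}
```

Because the outer key is the session ID alone, two users acting at the same time never collide; Time is only used to order the steps inside each list.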

--

.NET 2.0 for Delphi Programmers
www.midnightbeach.com/.net
What you need to know.
Nov 29 '06 #2
On Tue, 28 Nov 2006 21:15:08 -0500, "MJ" <ma*****@hotmail.com> wrote:
> I think Sid+Time should be the unique key with the stage being the value
> for the hash.

Based on what I am seeing in your example of the data, there appears to be no
unique identifier for any of the three data elements.

If you are only going to use the list to step through a loop, using the Time
element would make sense. However, using SID and Time together as the key might
not work unless Time has finer resolution than one minute. Users can do a lot of
things with a computer within one minute, so you would probably end up with
duplicate keys if the resolution were only one minute. That key combination
MIGHT be safe if the resolution of the log is in ticks or microseconds.
Good luck with your project,

Otis Mukinfus
http://www.arltex.com
http://www.tomchilders.com
Nov 29 '06 #3
MS
Thanks, Otis. The time actually has the seconds as well. I just neglected to
put it in my example. So the format is hh:mm:ss. The only way a user can do
the same action at the same second is if they accidentally double-click on a
link. I would like to ignore these.
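
One way this could be sketched, under the assumption stated above that a session never has two different stages in the same second: keep one SortedDictionary per session, keyed by time, and skip any row whose (sid, time) slot is already filled. The SessionLog class and its Add and PathFor methods are hypothetical names for illustration, not anything from the thread.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: one time-ordered map of stages per session.
public class SessionLog
{
    private readonly Dictionary<string, SortedDictionary<TimeSpan, string>> _sessions =
        new Dictionary<string, SortedDictionary<TimeSpan, string>>();

    // Records a step. Returns false when this session already has an entry
    // for the same second, which is treated here as a double-click and ignored.
    public bool Add(string sid, TimeSpan time, string stage)
    {
        SortedDictionary<TimeSpan, string> timeline;
        if (!_sessions.TryGetValue(sid, out timeline))
        {
            timeline = new SortedDictionary<TimeSpan, string>();
            _sessions.Add(sid, timeline);
        }
        if (timeline.ContainsKey(time))
        {
            return false;
        }
        timeline.Add(time, stage);
        return true;
    }

    // Joins the stages of one session in time order.
    // Throws KeyNotFoundException for an unknown session ID.
    public string PathFor(string sid)
    {
        // SortedDictionary enumerates its values in key (time) order.
        return string.Join("->", _sessions[sid].Values);
    }
}

public static class Program
{
    public static void Main()
    {
        var log = new SessionLog();
        log.Add("123", TimeSpan.Parse("12:00:05"), "search");
        log.Add("123", TimeSpan.Parse("12:00:05"), "search"); // double-click, ignored
        log.Add("123", TimeSpan.Parse("12:01:10"), "addToWishlist");
        Console.WriteLine(log.PathFor("123"));
    }
}
```

If two different stages can legitimately share a second, the first one wins in this sketch and the second is dropped, so the inner map would then need to hold a set or list of stages per timestamp instead.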
Nov 29 '06 #4
MS
Thanks, Jon. I'll look into this.
Nov 29 '06 #5
MS
By the way, I was able to do this with Perl with someone's help. I'm just
trying to convert it to a .net console app.

For those here that know Perl, it might shed some light on what I'm trying
to duplicate:

#Add the unique combinations into a hash
($userHash{$iisSID}{$iisTime}{$iisStageID})++;

From what I've been told, the "++" at the end is supposed to get rid of
duplicate combinations of sid, time, and stage. In other words, get rid of
the double-clicks.
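
For comparison, a rough C# analogue of that Perl line, offered as a sketch rather than a faithful port. In Perl the "++" only increments a counter stored under the three nested keys; the duplicates collapse because a hash key can exist only once. The UserHash class and its Increment, CountOf, and StagesFor methods below are hypothetical names, and parsing the time into a TimeSpan is an assumption.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical analogue of $userHash{$sid}{$time}{$stage}++:
// sid -> time -> stage -> number of times that exact combination was seen.
public class UserHash
{
    private readonly Dictionary<string, SortedDictionary<TimeSpan, Dictionary<string, int>>> _data =
        new Dictionary<string, SortedDictionary<TimeSpan, Dictionary<string, int>>>();

    // Creates the missing levels on demand (Perl does this implicitly), then counts.
    public void Increment(string sid, TimeSpan time, string stage)
    {
        SortedDictionary<TimeSpan, Dictionary<string, int>> byTime;
        if (!_data.TryGetValue(sid, out byTime))
        {
            byTime = new SortedDictionary<TimeSpan, Dictionary<string, int>>();
            _data.Add(sid, byTime);
        }
        Dictionary<string, int> byStage;
        if (!byTime.TryGetValue(time, out byStage))
        {
            byStage = new Dictionary<string, int>();
            byTime.Add(time, byStage);
        }
        int count;
        byStage.TryGetValue(stage, out count); // count is 0 when the stage is new
        byStage[stage] = count + 1;
    }

    // Returns how often a combination was seen, or 0 if it never was.
    public int CountOf(string sid, TimeSpan time, string stage)
    {
        SortedDictionary<TimeSpan, Dictionary<string, int>> byTime;
        Dictionary<string, int> byStage;
        int count;
        if (_data.TryGetValue(sid, out byTime) &&
            byTime.TryGetValue(time, out byStage) &&
            byStage.TryGetValue(stage, out count))
        {
            return count;
        }
        return 0;
    }

    // Lists each distinct (time, stage) of a session once, in time order.
    // Stages sharing one timestamp come out in unspecified order.
    public List<string> StagesFor(string sid)
    {
        var stages = new List<string>();
        foreach (KeyValuePair<TimeSpan, Dictionary<string, int>> slot in _data[sid])
        {
            stages.AddRange(slot.Value.Keys);
        }
        return stages;
    }
}

public static class Program
{
    public static void Main()
    {
        var userHash = new UserHash();
        userHash.Increment("123", TimeSpan.Parse("12:00:05"), "search");
        userHash.Increment("123", TimeSpan.Parse("12:00:05"), "search"); // double-click
        userHash.Increment("123", TimeSpan.Parse("12:01:10"), "addToWishlist");
        Console.WriteLine(string.Join("->", userHash.StagesFor("123")));
    }
}
```

The counts are kept only to mirror the Perl structure; if they are never read, the innermost dictionary could be replaced by a set of stage names.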
Nov 29 '06 #6