473,785 Members | 2,434 Online
Bytes | Software Development & Data Engineering Community
+ Post

Home Posts Topics Members FAQ

All records within x minutes of each other

Consider a table that holds Internet browsing history for users/machines,
date/timed to the minute. The object is to tag all times that are separated
by previous and subsequent times by x number of minutes or less (it could
vary, and wouldn't necessarily be a convenient round number). This will
enable reporting "active time" for users (a dubious inference, but hey).

There are a lot of derivative ways of seeing this information that might be
good to get to. What's the fist and last of these sets of times? What
percentage of a given period is spanned by active times, and not? What is
the average duration of such periods? What is the average interval between
web hits during such periods? During other times?

Blah, blah. The basic problem is my principal problem. I don't have much
experience with cursors, but from what I understand it would be very good
indeed to spare them, given the number of records I anticipate working
with.

I'd be glad of any pointers.

--

Scott
Aug 30 '05
29 4356
On 30 Aug 2005 13:28:36 -0700, --CELKO-- wrote:
Basics: Time comes in durations, so evetns have a start and stop time.
Hi Joe,

Really?

In the Usenet headers of your message is this line:NNTP-Posting-Date: Tue, 30 Aug 2005 20:28:41 +0000 (UTC)

This denotes the time you decided to hit the "send" button (or whatever
it's called in your software) and publish your message to the Usenet.
Please tell me the start and stop time of posting this message?

On my desk is a letter. The poststamp on the envelope is stamped by the
Dutch postal service. This stamp includes a date: "22 VIII 05". Please
don't tell me that this means that they started stamping it on midnight
and took a full 24 hours before the stamping was done.

Think about tracking when a web advertisement was served. The NNTP
protocol can't track how long I look at the ad. (IIRC, it's even
impossible to track if I have an ad blocker active). All web advertising
contracts are based on how often the ad is served. What is the start and
stop time of serving an ad?

How about police work? An officer is checking the streets, and at 11:47
AM he sees you driving through a red light. What's the start and stop
times of that? Or if you are caught speeding? Sure, you started speeding
before the officer caught you, and you might have continued after that,
but there's no way that the dept of Justice will ever find out - but
they do know the exact time that an officer of the law saw on his
equipment that you were driving 7.3 mph too fast.

Need I go on, or do you now have enough examples to know that in the
real world, time does NOT always have start and stsop time.

Best, Hugo
--

(Remove _NO_ and _SPAM_ to get my e-mail address)
Aug 31 '05 #11
AK opined thusly on Aug 30:
That done, you need to match every beginning to its corresponding end.
This is very simple using row_number() available in SQL 2005
In earlier versions, you can either emulate row_number() using
identity() column in a result set, or use a join condition like this:

...
from beginnings b join ends e on b.time<e.time
where not exists(select 1 from beginnings b1 where b.time< b1.time and
b1.time<e.time)
and not exists(select 1 from endings e1 where b.time< e1.time and
e1.time<e.time)


OK, here's the result in all its gory(sic). Still playing. It's interesting
to vary @interval and see the consequences. Now I have to figure out how to
justify any particular value for that. Geeez . . .

| CREATE FUNCTION ie_begin(@inter val datetime)
| RETURNS TABLE
| AS
|
| Return
| (
| select i0.Username, i0.vDate
| from ieHist i0 where not exists
| (
| select vDate from ieHist where vDate > i0.vDate - @interval and vDate < i0.vDate and i0.username = username
| )
| or exists
| (
| select vDate from ieHist where vDate < i0.vDate - @interval and vDate > i0.vDate and i0.username = username
| )
| group by i0.username, i0.vDate
| )
| go

| CREATE FUNCTION ie_end(@interva l datetime)
| RETURNS TABLE
| AS
|
| Return
| (
| select i0.Username, i0.vDate
| from ieHist i0 where not exists
| (
| select vDate from ieHist where vDate < i0.vDate + @interval and vDate > i0.vDate and i0.username = username
| )
| or exists
| (
| select vDate from ieHist where vDate > i0.vDate + @interval and vDate < i0.vDate and i0.username = username
| )
| group by i0.username, i0.vDate
| )
| go
And here's my sandbox:

| declare @interval as datetime
| declare @beginnings table (username varchar(30), vtime datetime)
| declare @ends table (username varchar(30), vtime datetime)
| set @interval = '00:10'
| insert into @beginnings (username, vtime) select * from ie_begin(@inter val)
| insert into @ends (username, vtime) select * from ie_end(@interva l)
| select b.username, b.vtime, e.vtime, datediff(minute , b.vtime, e.vtime) as duration
| from @beginnings b
| join @ends e
| on b.vtime < e.vtime and b.username = e.username
|Aaugh where
| not exists
| (
| select 1 from @beginnings b1 where b.vtime < b1.vtime and b1.vtime < e.vtime
| )
| and
| not exists
| (
| select 1 from @ends e1 where b.vtime < e1.vtime and e1.vtime < e.vtime
| )
| go

With 5 minutes for @interval, this was typical:

User_one 8/30/2005 1:28 PM 8/30/2005 1:30 PM 2
User_one 8/30/2005 1:36 PM 8/30/2005 1:37 PM 1
User_two 8/26/2005 12:40 PM 8/26/2005 12:42 PM 2
User_two 8/29/2005 6:52 AM 8/29/2005 6:55 AM 3
User_two 8/29/2005 10:34 AM 8/29/2005 10:38 AM 4
User_three 8/30/2005 3:52 PM 8/30/2005 3:59 PM 7
User_three 8/30/2005 4:06 PM 8/30/2005 4:07 PM 1
User_four 8/25/2005 12:17 PM 8/25/2005 12:18 PM 1
User_four 8/25/2005 1:33 PM 8/25/2005 2:02 PM 29
User_four 8/25/2005 2:02 PM 8/25/2005 2:21 PM 19
User_four 8/25/2005 2:28 PM 8/25/2005 2:32 PM 4
User_four 8/25/2005 2:44 PM 8/25/2005 3:27 PM 43
User_four 8/25/2005 4:28 PM 8/25/2005 4:30 PM 2
User_four 8/26/2005 3:17 PM 8/26/2005 3:19 PM 2
User_four 8/30/2005 4:28 PM 8/30/2005 4:29 PM 1

There's a LOT of work to do yet on this. Not bad for starters though.

Thanks again for pulling the cord on this old lawn-mower.

--

Scott
Aug 31 '05 #12
Scott Marquardt opined thusly on Aug 31:
OK, here's the result [...]
| on b.vtime < e.vtime and b.username = e.username
|Aaugh where
| not exists


Pardon that. That was supposed to go into an instant message, not this
post. ;-)

--

Scott
Aug 31 '05 #13
Hugo Kornelis (hugo@pe_NO_rFa ct.in_SPAM_fo) writes:
Need I go on, or do you now have enough examples to know that in the
real world, time does NOT always have start and stsop time.


When I read your post, it was quite clear that there was something
fundamentally wrong with it, but I could not just put my finger on it.

Until I came to this last paragraph. You are seriously trying to refer
to real world in an argument with Joe Celko?
--
Erland Sommarskog, SQL Server MVP, es****@sommarsk og.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techinf...2000/books.asp

Aug 31 '05 #14
>> The poststamp on the envelope is stamped by the Dutch postal service. This stamp includes a date: "22 VIII 05". Please
don't tell me that this means that they started stamping it on midnight
and took a full 24 hours before the stamping was done. <<

Do you really use that format?? I thought that Roman Numeral dates
went out with the NATO Standards under De Gaul. My age is showing.

For legal purposes in the US, that postmark would be the duration
('2005-08-22 00:00:00' to '2005-08-22 23:59:59.9999.. )
How about police work? An officer is checking the streets, and at 11:47 AM he sees you driving through a red light. What's the start and stop times of that? <<


That deals with rounding errors and precision. The way I drive, it
means that the cop did not have a watch that goes to microseconds :)
The conceptual model is that it took some time for me to go thru the
intersection at 100 MPH.

I am starting to like the MySQL convention of 'yyyy-mm-00' for a whole
month range and ''yyy-00-00' for a whole year range, but I have trouble
with '0000-00-00' for eternity.

Aug 31 '05 #15

"--CELKO--" <jc*******@eart hlink.net> wrote in message
news:11******** **************@ g49g2000cwa.goo glegroups.com.. .
you must be aware that web requests, like most other events in
computing, are not ever logged as durations, but as instants.<<
Do not confuse the recording of the data with the data model.
No one is.
Think
about a sign-in and sign-out sheet or timeclock. Each line is "half a
fact"; the whole fact is the duration spent on the job. In this case,
the user logs onto a site, stays there for x-minutes. He is not there
for a Chronon (that is the term for a point in time in temporal
databases).
What you're missing is most website log information is stateless. The
concept of a "duration" doesn't necessarily exist with webpages.

I.e you go to www.google.com and get a page.

Google records you requested a page. They have no idea how long you look at
it. You could get up, go have lunch, go for a walk, etc.

Shut down your computer, go to a different site, etc.

Celko, I suggest you go look at the logs of a webserver sometime. A
webpage is recorded as an instant in time.

Yes, one can try to model a visitors travel through a site, but one is not
necessarily modelling reality. They may pull up a page, go away for 5
minutes, and hit a link.

From that you can derive a "duration" they were on that page, but not
necessarily.

As I said, they could close their browser. You record no duration.
They could click a link to another site, nothing gets recorded in your logs.
Again, no duration.
They could simply type in a different URL, nothing gets recorded in your
logs. Again, no duration.
Or, they could go out for lunch, and come back and hit another page on your
site. But does that really mean that they spent a duration of an hour on
your site? Not raelly.

So his table MIGHT look like this:

CREATE TABLE Browsing
(user_id VARCHAR(30) NOT NULL,
website VARCHAR(255) NOT NULL,
login_time TIMESTAMP DEFAULT CURRENT_TIMESTA MP NOT NULL,
logout_time TIMESTAMP, -- null means still there
CHECK (login_time <= logout_time), -- less than one minute problem
PRIMARY KEY (user_id, website, login_time));

The real problem in this situaiton is having to round to the minute.
We can force a convention on the stopping time to keep it away from the
starting point by a bit less than one munute -- ('yyyy-mm-ddThh:mm:00'
to 'yyyy-mm-ddThh:mm:59.997 ').
Ever read the paradoxes of Zeno? He went thru what happens when you
believe in Chronons. He lived in a time when Greek math was "integer
only" and without a continuum.

Sep 1 '05 #16
On Wed, 31 Aug 2005 16:49:20 -0500, Scott Marquardt wrote:
Scott Marquardt opined thusly on Aug 31:
OK, here's the result [...]

| on b.vtime < e.vtime and b.username = e.username
|Aaugh where
| not exists


Pardon that. That was supposed to go into an instant message, not this
post. ;-)


But it's so appropriate! Reminiscent of the required "PLEASE" statements
in INTERCAL (Language Without A Good Acronym).
Sep 1 '05 #17
I did not say that very well. The model is that the traffic light was
modeled by Lights (color, start_time, end_time). and that I was in the
intersection with ('red', 2005-09-01 12:00:00', '2005-09-01
12:20:00'). I need to talk to the city about a traffic light with a 20
minute duration.

Sep 1 '05 #18
Stu
Joe,

That's a very valid observation about the seperation of the data model
from the data recorded; there are two logistical problems raised by
this, however:

1. The tools that collect web traffic information (typically firewall
syslog) are at best a proxy measure for web usage; what they really
capture is connection information. The firewall doesn't know what
happens to a packet when it passes by; all it can say is that at this
instant (Chronon; neat term), a packet passed through the firewall from
one computer to the next.

Many people use this information to try and gather web usage, but it's
an imperfect model. There is no accurate way (as Scott said earlier)
to indicate how much time a person actually spent interacting with a
web site. All that can be said for certain is that a packet passed
from a machine that's associated with that user to the Internet at x
time.

2. The other logistical problem that is raised is the issue of
multitasking. Right now, I have 6 browser tabs open (in Firefox on a
multi-monitor system); I switch back and forth looking at different
information. How much time am I spending on a website? It can't be
measured looking at syslog data because every time I interact with a
different web page, a connection event gets recorded. Any report that I
run trying to decipher my web behavior would be nearly impossible to
interpret (e.g., Stu went to CNN then to Google Groups back to CNN, to
email, back to CNN, and then to Google Groups. All in the span of a
few seconds).
Stu

Sep 1 '05 #19
"Time is what keeps everythign from happening at once!" -- George
Karlin.

Yes, parallelism is a bitch. I'd handle the multitasking by modeling
the session/connection rather than the user, then attach the user to
each of those. But the truth is that things were in durations -- maybe
short ones (Aunt Mabel's photos) or long ones (disgusting_por no.com)
and we have the problem of only being able to catch "half a fact".

I did a schema for a company that does a electronic timeclock system
that recorded nothing but the id of a fob and a UTC time. The fobs are
color coded for each job and you touch them to the mil spec timeclock
-- it looks like a pad lock that can take a bullet at point blank
range. If you think about it, you can do a lot with a minimal amount
of data.

Sep 1 '05 #20

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

8
4338
by: tom | last post by:
I am new to SQL administration. >From a list of IDs that are the primary key in one table (i.e. Customer Table), I want to make changes in tables that use those IDs as a foreign key. Basically I want to say: If fk_ID is in list then do these statements to that record
0
277
by: Dale Ring | last post by:
Sorry about the dup post, somehow my original post was put on an existing thread instead of a new one. Access 2000 I am working with a service call database and would like to create a query that looks for repeat trips within a user defined time period in minutes. The affected fields are CustID, CallDate, TimeComplete, TimeReceived
0
1224
by: Dale Ring | last post by:
Sorry about the dup post, somehow my original post was put on an existing thread instead of a new one. Access 2000 I am working with a service call database and would like to create a query that looks for repeat trips within a user defined time period in minutes. The affected fields are CustID, CallDate, TimeComplete, TimeReceived
7
2400
by: Trevor Best | last post by:
I have an import routine that takes a generic file (Excel, CSV, Fixed length, PDMS BOM, XML, etc) and maps fields to a temp import table on the server, in the field mapping there may be functions specified so the user may put in "Mid([DocNumber,12,3)" as a value to go into FieldX. I've tried 3 methods of getting the data from a linked table (CSV file) to the SQL Server, each having some drawbacks. Method 1: DAO, pokes each record in...
4
1815
by: Lyle Fairfield | last post by:
This takes about 2 seconds on my rather obsolete machine: Option Explicit ' Test is a simple JET Table with four fields ' ID -> autonumber primary key ' Field1 -> Integer (maps to VBA long) ' Field2 -> Text (50) ' Field3 -> DateTime
0
1188
by: lcrwebmaster | last post by:
PLEASE HELP!!!! Needing help with developing a stored proc for the following: Production crews work at the following times: CREW A = 7:30AM to 3:30PM CREW B = 3:30PM to 11:30PM CREW C = 11:30PM to 7:30AM - There are Record_Begin_Date and Record_End_Date time/date fields in the database.
3
8086
by: wvmbark | last post by:
First time poster... I just found this forum and it appears there's plenty of people here that could make short work of problem that's been driving me absolutely bonkers for months. Every day we incur numerous service problems “Events”. Each morning we have a global conference call where events which occurred within the previous 24 hours are discussed. Prior to the call, an analyst has to review these events and provide a report, ‘The Morning...
4
2436
by: rn5a | last post by:
A MS-Access DB has 3 tables - Teacher, Class & TeacherClass. The Teacher table has 2 columns - TeacherID & TeacherName (TeacherID being the primary key). The Class table too has 2 columns - ClassID & ClassName (ClassID being the primary key) The TeacherClass table has 3 columns - TCID (AutoNumber), TeacherID & ClassID. One teacher can teach multiple classes & one class can be taught by multiple teachers. The TeacherClass table basically...
6
24765
markrawlingson
by: markrawlingson | last post by:
Hopefully someone can help me out with this, it's driving me nuts... I have two tables - We'll call them table1 and table2. Table1 holds event information, and table2 holds user registration data pertaining to each event in table1. So for each record in Table1 there could be hundreds of records pertaining to that record in Table2 - I am trying to count those records (to reveal the number of people registered for the event held in table1) I...
0
9645
marktang
by: marktang | last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However, people are often confused as to whether an ONU can Work As a Router. In this blog post, we’ll explore What is ONU, What Is Router, ONU & Router’s main usage, and What is the difference between ONU and Router. Let’s take a closer look ! Part I. Meaning of...
0
9480
by: Hystou | last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can effortlessly switch the default language on Windows 10 without reinstalling. I'll walk you through it. First, let's disable language synchronization. With a Microsoft account, language settings sync across devices. To prevent any complications,...
0
10147
jinu1996
by: jinu1996 | last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven tapestry of website design and digital marketing. It's not merely about having a website; it's about crafting an immersive digital experience that captivates audiences and drives business growth. The Art of Business Website Design Your website is...
0
8972
agi2029
by: agi2029 | last post by:
Let's talk about the concept of autonomous AI software engineers and no-code agents. These AIs are designed to manage the entire lifecycle of a software development project—planning, coding, testing, and deployment—without human intervention. Imagine an AI that can take a project description, break it down, write the code, debug it, and then launch it, all on its own.... Now, this would greatly impact the work of software developers. The idea...
1
7499
isladogs
by: isladogs | last post by:
The next Access Europe User Group meeting will be on Wednesday 1 May 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome a new presenter, Adolph Dupré who will be discussing some powerful techniques for using class modules. He will explain when you may want to use classes instead of User Defined Types (UDT). For example, to manage the data in unbound forms. Adolph will...
0
6739
by: conductexam | last post by:
I have .net C# application in which I am extracting data from word file and save it in database particularly. To store word all data as it is I am converting the whole word file firstly in HTML and then checking html paragraph one by one. At the time of converting from word file to html my equations which are in the word document file was convert into image. Globals.ThisAddIn.Application.ActiveDocument.Select();...
0
5381
by: TSSRALBI | last post by:
Hello I'm a network technician in training and I need your help. I am currently learning how to create and manage the different types of VPNs and I have a question about LAN-to-LAN VPNs. The last exercise I practiced was to create a LAN-to-LAN VPN between two Pfsense firewalls, by using IPSEC protocols. I succeeded, with both firewalls in the same network. But I'm wondering if it's possible to do the same thing, with 2 Pfsense firewalls...
1
4050
by: 6302768590 | last post by:
Hai team i want code for transfer the data from one system to another through IP address by using C# our system has to for every 5mins then we have to update the data what the data is updated we have to send another system
3
2879
bsmnconsultancy
by: bsmnconsultancy | last post by:
In today's digital era, a well-designed website is crucial for businesses looking to succeed. Whether you're a small business owner or a large corporation in Toronto, having a strong online presence can significantly impact your brand's success. BSMN Consultancy, a leader in Website Development in Toronto offers valuable insights into creating effective websites that not only look great but also perform exceptionally well. In this comprehensive...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.