Bytes IT Community

user identification

Hi all,

We have about 10 different domains that are linked very closely and we
want to identify and track every single user that surfs our websites.
Later we want to analyse user paths and find out the search robots with
the referring search words.

What are the possibilities?
Cookies are not accepted by 40% of our users, and on top of that a
different cookie is created for each domain, which makes it really
complicated.
I guess a combination of browser type, operating system, hostname etc.
is really insecure, as there are many users using the same stuff.
I think the only secure way to log this is by using sessions.

One disadvantage of sessions is that they cost a lot of server
performance when there are many users at the same time.
Is there a way to reduce performance of the sessions?
Are there any other possibilities except sessions?
Are there freeware php statistic functions that allow the reuse of
their statistic data in our own implementations?

Thank you very much for your help!

Dennis

Jul 17 '05 #1
2 Replies


> We have about 10 different domains that are linked very closely and we
> want to identify and track every single user that surfs our websites.
> Later we want to analyse user paths and find out the search robots with
> the referring search words.
>
> What are the possibilities?

Require the users to log in with an ID and password after registration.
That may not be an option. You also probably need sessions to keep
track of the fact that they HAVE logged in.

When the user first hits one of your pages, the URL passed from
many search engines gives you the keywords. This you can get from
Apache logs, or with PHP. Figuring out where the user goes after
that is harder, especially across domains.
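
As a rough sketch of the PHP side, something like the following would pull the search words out of the referrer. The function name search_terms_from_referer and the parameter names it checks (q, p, query) are assumptions for illustration only; every search engine picks its own parameter name, and not every one passes the query along.

```php
<?php
// Hypothetical helper: returns the search words found in a referrer URL,
// or null when the referrer carries none.
function search_terms_from_referer($referer)
{
    $query = parse_url($referer, PHP_URL_QUERY);
    if (!is_string($query) || $query === '') {
        return null;
    }
    parse_str($query, $params);
    // Assumed parameter names; extend this list for the engines you care about.
    foreach (array('q', 'p', 'query') as $name) {
        if (isset($params[$name]) && is_string($params[$name]) && $params[$name] !== '') {
            return $params[$name];
        }
    }
    return null;
}

// The referrer header is optional, so check that it is set before using it.
$referer = isset($_SERVER['HTTP_REFERER']) ? $_SERVER['HTTP_REFERER'] : '';
$terms = search_terms_from_referer($referer);
```

The result is null for visitors who did not arrive from a search page, so it can be stored next to the hit without special casing.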

> Cookies are not accepted by 40 % of our users and in addition to that
> for each domain a different cookie is created what makes it really
> complicated.
> I guess a combination of Browser type, Operating System, Hostname etc
> is really insecure as there are many users using the same stuff.

I hope 'secure' is *NOT* what you are looking for, and what you are
looking for is more like 'accurate Big Brother watching'. If
'security' is any kind of issue (say, you are a bank or doctor or
auction site), you shouldn't even be thinking about fingerprinting
browsers as a way of identifying users.

Browser fingerprinting is inaccurate for a number of reasons:
(a) load-sharing proxies make requests for even stuff on the SAME
page come from several different IP addresses (and I believe
most AOL users go through one).
(b) If you don't use IP address, there's not enough info to distinguish
the users (what percentage are using IE(latest) with Windows XP?
Over half?)
(c) NAT gateways make lots of really different users appear to come
from the SAME IP. See (b) for why you can't tell them apart.

It might work OK for marketing stats but not for anything requiring
'security'.
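
To make (a) through (c) concrete, here is a small sketch. The function browser_fingerprint, the choice of headers and all the sample values are made up for illustration; it is not a recommended scheme, just a way to see both failure modes.

```php
<?php
// Hypothetical fingerprint: a hash of headers that every request carries.
function browser_fingerprint(array $server, $use_ip)
{
    $parts = array(
        isset($server['HTTP_USER_AGENT']) ? $server['HTTP_USER_AGENT'] : '',
        isset($server['HTTP_ACCEPT_LANGUAGE']) ? $server['HTTP_ACCEPT_LANGUAGE'] : '',
    );
    if ($use_ip) {
        $parts[] = isset($server['REMOTE_ADDR']) ? $server['REMOTE_ADDR'] : '';
    }
    return sha1(implode('|', $parts));
}

// Two different people behind one NAT gateway with the same browser (made-up values).
$alice = array(
    'HTTP_USER_AGENT'      => 'Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)',
    'HTTP_ACCEPT_LANGUAGE' => 'en-us',
    'REMOTE_ADDR'          => '192.0.2.10',
);
$bob = $alice;

// One person whose requests leave through two different proxy addresses.
$carol_first  = $alice;
$carol_second = array_merge($alice, array('REMOTE_ADDR' => '192.0.2.11'));

// Case (c): two users collapse into one fingerprint.
$two_users_look_alike = browser_fingerprint($alice, true) === browser_fingerprint($bob, true);
// Case (a): one user splits into two fingerprints.
$one_user_looks_like_two = browser_fingerprint($carol_first, true) !== browser_fingerprint($carol_second, true);
```

Dropping the IP address fixes the second case and makes the first one worse, which is point (b).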

> I think the only secure way to logg this is by the way of using
> sessions.

Well, you've got a problem. To keep track of a session, you need at
least one of (a) cookies, (b) session IDs passed via URL transparently
using trans_sid, (c) session IDs passed explicitly via URL manually,
or (d) hidden form variables on the page.

(a) doesn't work across domains and besides, a lot of your users have
them turned off.
(b) only works with relative URLs (which have to be the same domain).
(c) is a pain in the butt and may offend users for the same reason
they have cookies off,
and (d) works only if every link is a form, and is also a pain in the butt.
I presume that if cookies are often turned off, Javascript is also often
turned off.

You're essentially stuck with (c), with the others sometimes working
as a backup.
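
A minimal sketch of (c), assuming a hypothetical helper called link_with_session and a made-up target domain. It only builds the link; the receiving domain still has to accept the ID and must be able to read the same session storage.

```php
<?php
// Hypothetical helper: append the session ID to a link that points at another domain.
function link_with_session($url, $session_name, $session_id)
{
    // Keep any #fragment at the very end of the URL.
    $fragment = '';
    $hash = strpos($url, '#');
    if ($hash !== false) {
        $fragment = substr($url, $hash);
        $url = substr($url, 0, $hash);
    }
    $separator = (strpos($url, '?') === false) ? '?' : '&';
    return $url . $separator . rawurlencode($session_name) . '=' . rawurlencode($session_id) . $fragment;
}

// In a page, after session_start():
// echo '<a href="' . htmlspecialchars(link_with_session('http://www.example.org/shop.php',
//     session_name(), session_id())) . '">Shop</a>';
```

On the receiving page you would pass the ID from the URL to session_id() before calling session_start(). Be aware that an ID in the URL ends up in logs, bookmarks and referrers, so anyone who sees the link can pick up the session.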

> One disadvantage of sessions is that they take very much performance of
> the server when there are many users at the same time.

It is possible to write session handlers that stuff the session info
into a MySQL database instead of using lots of little files in a
directory. Whether this is a performance increase or decrease
depends on your setup and on things like how much data gets stuffed
into sessions. Among other things, this gets you the ability to
share session data between different (load-shared) physical servers.
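
Roughly, such a handler could look like the sketch below. It assumes PHP 8, a PDO connection, and a table named sessions with the columns id, data and touched; the class name and the table layout are made up for illustration, and whether it beats the file handler depends on your setup.

```php
<?php
// Sketch of a database-backed session handler (PHP 8 syntax).
// Assumed table: sessions(id VARCHAR(128) PRIMARY KEY, data TEXT, touched INT).
class DbSessionHandler implements SessionHandlerInterface
{
    private PDO $pdo;

    public function __construct(PDO $pdo)
    {
        $this->pdo = $pdo;
    }

    public function open(string $path, string $name): bool
    {
        return true; // the PDO connection is already open
    }

    public function close(): bool
    {
        return true;
    }

    public function read(string $id): string|false
    {
        $stmt = $this->pdo->prepare('SELECT data FROM sessions WHERE id = ?');
        $stmt->execute(array($id));
        $data = $stmt->fetchColumn();
        return ($data === false) ? '' : (string) $data; // unknown ID: empty session
    }

    public function write(string $id, string $data): bool
    {
        // REPLACE INTO overwrites the row with the same primary key (MySQL syntax).
        $stmt = $this->pdo->prepare('REPLACE INTO sessions (id, data, touched) VALUES (?, ?, ?)');
        return $stmt->execute(array($id, $data, time()));
    }

    public function destroy(string $id): bool
    {
        $stmt = $this->pdo->prepare('DELETE FROM sessions WHERE id = ?');
        return $stmt->execute(array($id));
    }

    public function gc(int $max_lifetime): int|false
    {
        // Remove sessions that have not been written for longer than the lifetime.
        $stmt = $this->pdo->prepare('DELETE FROM sessions WHERE touched < ?');
        $stmt->execute(array(time() - $max_lifetime));
        return $stmt->rowCount();
    }
}

// Usage (placeholders for the connection details):
// $pdo = new PDO('mysql:host=localhost;dbname=your database name', 'user', 'password');
// session_set_save_handler(new DbSessionHandler($pdo), true);
// session_start();
```

Every server that connects to the same database then sees the same sessions, which is the load-sharing benefit mentioned above.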

> Is there a way to reduce performance of the sessions?

You could always add time-wasting code such as checking whether a
user is logged in several thousand times on each page, but I don't
think this is what you meant to ask.

> Are there any other possibilities except sessions?

There are some do-it-yourself methods which essentially re-implement
sessions, often poorly, using the methods (a) through (d) to
keep track of a session ID. Once you can track the session ID,
you can stuff the session data anywhere you need to (files, database,
whatever).

> Are there freeware php statistic functions that allow the reuse of
> their statistic data in our own implementations?

Probably, but I'm not sure how calculating mean and standard deviation
of something is going to get you the raw data in the first place.

Gordon L. Burditt
Jul 17 '05 #2

On Mon, 30 May 2005 13:40:19 -0700, d.schulz81 wrote:
> Hi all,
>
> We have about 10 different domains that are linked very closely and we
> want to identify and track every single user that surfs our websites.
> Later we want to analyse user paths and find out the search robots with
> the referring search words.
>
> What are the possibilities?
> Cookies are not accepted by 40 % of our users and in addition to that for
> each domain a different cookie is created what makes it really
> complicated.
> I guess a combination of Browser type, Operating System, Hostname etc is
> really insecure as there are many users using the same stuff. I think the
> only secure way to logg this is by the way of using sessions.
>
> One disadvantage of sessions is that they take very much performance of
> the server when there are many users at the same time. Is there a way to
> reduce performance of the sessions? Are there any other possibilities
> except sessions? Are there freeware php statistic functions that allow the
> reuse of their statistic data in our own implementations?
>
> Thank you very much for your help!
>
> Dennis


This one is based on a MySQL table; it records standard information from
$_SERVER[]. You might want to resolve the IPs on entry to the table.
Personally I prefer to look up only those that haven't been looked up
before, which I do locally, so that part isn't included here. Also, I moved the
database connect to inside the function, for the sake of those new to this
stuff who want to see how it fits together.

Finally, don't criticise my crappy HTML bits. I have a real battle with
HTML. Don't bother asking me why, because I don't know the answer. If only
we could do PHP without having to bother with the HTML side at all, I'd be
alright.

The first function is the one to include in each of your pages (it records
the page hits as well as site hits).

<?php
// Visitor hits: records one row per page request in the `visitors` table.
// Uses the mysqli extension with a prepared statement, so values taken
// from the request cannot break the SQL.
function VisitorHit()
{
    //constants (guarded so that calling VisitorHit() twice does not redefine them)
    if (!defined('DB_USER')) {
        define('DB_USER', 'your mysql user name');
        define('DB_PASSWORD', 'your password');
        define('DB_HOST', 'localhost');
        define('DB_NAME', 'database name');
    }

    //connect to database and open it
    mysqli_report(MYSQLI_REPORT_OFF); // report failures through return values
    $dbh = mysqli_connect(DB_HOST, DB_USER, DB_PASSWORD, DB_NAME);
    if (!$dbh) {
        die("I'm terribly sorry but I cannot connect to the database because: " . mysqli_connect_error());
    }

    //collect the request details; anything missing or empty is stored as "none"
    $keys = array('PHP_SELF', 'HTTP_USER_AGENT', 'REMOTE_ADDR', 'REQUEST_METHOD',
                  'QUERY_STRING', 'REQUEST_URI', 'HTTP_REFERER');
    $values = array();
    foreach ($keys as $key) {
        $values[] = (isset($_SERVER[$key]) && strlen($_SERVER[$key]) > 0) ? $_SERVER[$key] : 'none';
    }

    // build SQL string (NULL lets an AUTO_INCREMENT `visitorid` pick the next number)
    $sql = "INSERT INTO `visitors` ( `visitorid` , `self` , `browser` , `ip` , `requestmethod` , `querystring` , `requesturl` , `referer` , `touched` ) VALUES (NULL, ?, ?, ?, ?, ?, ?, ?, NOW( ))";

    //perform query
    $stmt = mysqli_prepare($dbh, $sql);
    if ($stmt) {
        mysqli_stmt_bind_param($stmt, 'sssssss', ...$values);
        mysqli_stmt_execute($stmt);
        mysqli_stmt_close($stmt);
    }

    // no further checking because we aren't that bothered about missing the odd hit
}

?>

This next part is really an example of how a 'stats.php' file might be
used to examine the data.
<?php
$page_title='Stats';
include 'header.inc';

define('DB_USER','your mysql username');
define('DB_PASSWORD','your password');
define('DB_HOST','localhost');
define('DB_NAME','your database name');

//connect to database and open it
mysqli_report(MYSQLI_REPORT_OFF); // report failures through return values
$dbh = mysqli_connect(DB_HOST, DB_USER, DB_PASSWORD, DB_NAME);
if( ! $dbh) {
    die ('I cannot connect to the database because: '.mysqli_connect_error());
}

// run a query that returns a single value; gives "n/a" if the query fails
function single_value($dbh, $sql)
{
    $result = mysqli_query($dbh, $sql);
    $row = $result ? mysqli_fetch_row($result) : null;
    return $row ? $row[0] : 'n/a';
}

//select minimum date record
$mindate = single_value($dbh, "SELECT DATE_FORMAT(MIN(touched),'%D %b %Y') FROM visitors");

//select maximum date record
$maxdate = single_value($dbh, "SELECT DATE_FORMAT(MAX(touched),'%D %b %Y') FROM visitors");

//DISTINCT visitors (counted by IP address)
$distinct = single_value($dbh, "SELECT COUNT(DISTINCT ip) FROM visitors");

//DISTINCT browsers
$browser = single_value($dbh, "SELECT COUNT(DISTINCT browser) FROM visitors");
?>

<div >
<h2><a name="Stats">Statistics</a></h2>
<?php
echo "<p>Stats between $mindate and $maxdate</p>";
echo "<p>Number of visitors $distinct</p>";
echo "<p>Number of browsers $browser</p>";

//hits per browser, most common first

$sql = "SELECT browser, COUNT(*) AS icount FROM visitors GROUP BY browser ORDER BY icount DESC";

$result = mysqli_query($dbh, $sql);

echo "<p>";
if($result)
{
    while( $row = mysqli_fetch_row($result))
    {
        // the browser string is supplied by the client, so escape it before printing
        printf("%s --- %s<br>", $row[1], htmlspecialchars($row[0]));
    }
}
echo "</p>";

?>

</div>

<?php
include 'footer.inc'
?>
Jul 17 '05 #3
