Bytes | Software Development & Data Engineering Community

how to track unique visitors

Hi,

I wonder - how do tools like Google Analytics distinguish real unique users
from all those millions of bots, crawlers, and so on?
Is it based on IP ranges, or "JavaScript = active", or something else?

Thanks in advance,

Christine
Dec 11 '07 #1
Christine Genzer wrote on 11 dec 2007 in comp.lang.javascript:
I wonder - how do tools like Google Analytics distinguish real unique users
from all those millions of bots, crawlers, and so on?
Is it based on IP ranges, or "JavaScript = active", or something else?
Most bots have
"bot", "crawler", "slurp", "netcraft", "Jeeves", etc.
in their HTTP_USER_AGENT header string.
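A crude version of that check can be sketched in JavaScript - the marker list below is illustrative, not a complete catalogue of real bot signatures:

```javascript
// Flag requests whose User-Agent contains a known bot marker.
// The marker list is illustrative only; real bots use many more strings.
var botMarkers = ["bot", "crawler", "slurp", "spider", "netcraft", "jeeves"];

function looksLikeBot(userAgent) {
  var ua = String(userAgent).toLowerCase();
  return botMarkers.some(function (marker) {
    return ua.indexOf(marker) !== -1;
  });
}

console.log(looksLikeBot("Mozilla/5.0 (compatible; Googlebot/2.1)")); // true
console.log(looksLikeBot("Mozilla/5.0 (Windows NT 6.0) Firefox/2.0")); // false
```

On a server you would feed this the User-Agent request header and simply not count hits that match.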

You can never catch them all, as some, like live.com,
also try to mimic normal browsing behaviour -
to see whether one feeds them different pages than the "normal" folks get -
while coming from the same IP range as the crawler
that visited moments before, or so I believe.

--
Evertjan.
The Netherlands.
(Please change the x'es to dots in my emailaddress)
Dec 11 '07 #2
Christine Genzer <Ch**********@web.de> wrote:
I wonder - how do tools like Google Analytics distinguish real unique users
from all those millions of bots, crawlers, and so on?
Is it based on IP ranges, or "JavaScript = active", or something else?
Google Analytics and similar tools use JavaScript to fetch a URL from the
analytics server. Crawlers don't execute JavaScript, so they never touch the
analytics server.

Likewise, people who use ad blockers to block the analytics server,
DoubleClick, or whatever are invisible to those particular services.

In addition, you can configure Analytics to ignore particular IP addresses,
which is a handy way to exclude your own hits from the reports.
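The principle behind such page tags can be sketched as follows; stats.example.com and the parameter names are placeholders, not Google's actual endpoint:

```javascript
// Build the beacon URL a page tag would request from the analytics
// server. In a browser the snippet then assigns it to an Image, so
// only visitors that actually execute JavaScript ever send the hit.
function buildBeaconUrl(page, referrer) {
  return "http://stats.example.com/hit" +
    "?page=" + encodeURIComponent(page) +
    "&ref=" + encodeURIComponent(referrer) +
    "&t=" + new Date().getTime(); // cache-buster so the hit is never cached
}

// In a browser you would fire the request with:
//   new Image().src = buildBeaconUrl(location.pathname, document.referrer);
console.log(buildBeaconUrl("/index.html", "http://example.org/"));
```

Because crawlers that skip script execution never build this URL, they never show up in the beacon server's logs.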
Dec 11 '07 #3
Duncan Booth wrote:
Christine Genzer <Ch**********@web.de> wrote:
>I wonder - how do tools like Google Analytics distinguish real unique users
>from all those millions of bots, crawlers, and so on?
>Is it based on IP ranges, or "JavaScript = active", or something else?
Google Analytics and similar tools use JavaScript to fetch a URL from the
analytics server. Crawlers don't execute JavaScript, so they never touch the
analytics server.
Well, this was my idea too, but is that really true? Everyone says that
crawlers and bots cannot execute JavaScript - but why? If a browser
can do it, why shouldn't a bot be able to do the same? Harvesters
and spam bots get more and more intelligent, I guess - a lot of email
addresses are "hidden" by JavaScript - so wouldn't a good spam bot
also execute JavaScript? JavaScript is source code, instructions just
like HTML - so why couldn't it be executed?

Thanks in advance, Christine
Dec 12 '07 #4
Christine Genzer <Ch**********@web.de> wrote:
Duncan Booth wrote:
>Christine Genzer <Ch**********@web.de> wrote:
>>I wonder - how do tools like Google Analytics distinguish real unique
>>users from all those millions of bots, crawlers, and so on?
>>Is it based on IP ranges, or "JavaScript = active", or something else?
>Google Analytics and similar tools use JavaScript to fetch a URL from the
>analytics server. Crawlers don't execute JavaScript, so they never
>touch the analytics server.

Well, this was my idea too, but is that really true? Everyone says
that crawlers and bots cannot execute JavaScript - but why? If a
browser can do it, why shouldn't a bot be able to do the same?
Harvesters and spam bots get more and more intelligent, I guess - a lot
of email addresses are "hidden" by JavaScript - so wouldn't a good spam
bot also execute JavaScript? JavaScript is source code, instructions
just like HTML - so why couldn't it be executed?
It isn't that they 'could not'; rather, in general they 'do not'. It's
quite easy to write a bot which executes all JavaScript: just drive IE
through its automation interface. But it will be much slower and doesn't
gain you much.

If your crawler is something like Google then you want to respect
people's wishes, so there isn't any point looking for hard-to-find
pages: if someone wants you to find all their pages they can create a
sitemap with non-javascript links.
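For instance, a minimal sitemap in the sitemaps.org format (the URLs below are placeholders) lists every page as a plain URL, so a crawler needs no script execution at all to discover them:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://example.com/</loc></url>
  <url><loc>http://example.com/contact.html</loc></url>
</urlset>
```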

If your crawler is a harvester looking for email addresses then you'll
get more by crawling fast and missing some than by crawling slowly to
grab them all. Perhaps this will change in the future as more sites
obfuscate email addresses. You might find this article of interest:
http://nadeausoftware.com/articles/2...esses_spammers
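As an illustration of the kind of obfuscation such articles describe, a page can assemble the address at run time so it never appears verbatim in the HTML; the address and element id below are placeholders:

```javascript
// Assemble an email address at run time so harvesters that only
// scan the raw HTML never see it. \u0040 is the "@" character.
function assembleAddress(user, domain) {
  return user + "\u0040" + domain;
}

// In a page you would then do something like:
//   document.getElementById("contact").href =
//     "mailto:" + assembleAddress("info", "example.com");
console.log(assembleAddress("info", "example.com")); // "info@example.com"
```

A harvester would have to run the script (or pattern-match the trick) to recover the address, which is exactly the cost/benefit trade-off discussed above.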

Also, even if you did decide to execute JavaScript, making a bot
specifically exclude the common analytics sites would make sense: why
advertise what you are doing?

Dec 12 '07 #5
Christine Genzer wrote:
Duncan Booth wrote:
>Christine Genzer <Ch**********@web.de> wrote:
>>I wonder - how do tools like Google Analytics distinguish real unique users
>>from all those millions of bots, crawlers, and so on?
>>Is it based on IP ranges, or "JavaScript = active", or something else?
>
>Google Analytics and similar tools use JavaScript to fetch a URL from the
>analytics server. Crawlers don't execute JavaScript, so they never touch the
>analytics server.

Well, this was my idea too, but is that really true? Everyone says that
crawlers and bots cannot execute JavaScript - but why? If a browser
can do it, why shouldn't a bot be able to do the same?
For simple code, it would need not only a script parser but also a script
engine, i.e. an ECMAScript implementation. For more complex code, it would
also need to implement an AOM/DOM and an engine for that, too. The effort
would by far outweigh the expected gain of finding URLs in script code.
Reasonable authors also write Web sites that degrade gracefully, so there is
not really a need to crawl scripts.

Please don't multi-post to several JavaScript newsgroups, and have your
keyboard repaired. Thanks in advance.
PointedEars
--
Prototype.js was written by people who don't know javascript for people
who don't know javascript. People who don't know javascript are not
the best source of advice on designing systems that use javascript.
-- Richard Cornford, cljs, <f8*******************@news.demon.co.uk>
Dec 12 '07 #6

This thread has been closed and replies have been disabled. Please start a new discussion.

