472,095 Members | 2,490 Online
Bytes | Software Development & Data Engineering Community
Post +

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 472,095 software developers and data experts.

How to setup a MySQL Table to keep track of stats?

I have a new system Im building that stores entries of what people are
searching for on my sites.

I want to be able to keep records of how many times a keyword was
searched for daily, and from that I can calculate weekly and monthly.

At this point I have one entry per search phrase with the number of
hits the search phrase has gotten, and the last time it was updated.

As I start to take the program out of testing and move in more into a
used tool Im getting worried that my idea of switching to mutable
entries for the same search phrase would be a bad idea as within the
last 15 days I have stored more then 3 million unique search phrases,
and a unknown number of hits.

So should I make a rolling database that stores each search then rolls
that data over to another database once a week that users can use, or
is their a better way of doing it where I only keep one entry per
search phrase and am still able to keep records how daily search
amounts so that I can track trends, etc in search phases on my sites?

My plan to have a 3-5 month window that I can look at data on search
information.

Aug 16 '06 #1
15 6575
>I want to be able to keep records of how many times a keyword was
>searched for daily, and from that I can calculate weekly and monthly.

At this point I have one entry per search phrase with the number of
hits the search phrase has gotten, and the last time it was updated.

As I start to take the program out of testing and move in more into a
used tool Im getting worried that my idea of switching to mutable
entries for the same search phrase would be a bad idea as within the
last 15 days I have stored more then 3 million unique search phrases,
and a unknown number of hits.

So should I make a rolling database that stores each search then rolls
that data over to another database once a week that users can use, or
is their a better way of doing it where I only keep one entry per
search phrase and am still able to keep records how daily search
amounts so that I can track trends, etc in search phases on my sites?

My plan to have a 3-5 month window that I can look at data on search
information.
This depends a lot on what kind of statistics you want.

Think about how an odometer works. The count for a given car just
keeps going up and up and up (hopefully you use a data type big
enough so you avoid "rollover"). You get the counts in a given
period by subtracting the readings at the beginning and the end of
the period.

So, if you take snapshots of the active count table each month or
week or whatever, you can get counts for any month. You could
duplicate the whole table, or have one table with count entries
for current, Jan 1, Feb 1, Mar 1, ... .
Aug 17 '06 #2
I ran into a guy in IRC that helped me get it worked out.

This is what we came up with

Table 1:
ID | Keyword

Tabl 2:
ID | Count | DateTime

If it is a new search we will store the search phrase into table 1, and
then add one hit to table 2. If its the same search on the same date
then we will add to the count in table 2, otherwise we add a new entry
for a new day and start the count over.

Then we will use replication (Im learning about that) to copy it to
another DB for front end usage.

Im going to use BigInt(20), DateTime, and VarChar(80) on the above
tables. Dont know if DateTime is best for tracking days or not, but it
has worked well in the past.

Sound good?

Gordon Burditt wrote:
I want to be able to keep records of how many times a keyword was
searched for daily, and from that I can calculate weekly and monthly.

At this point I have one entry per search phrase with the number of
hits the search phrase has gotten, and the last time it was updated.

As I start to take the program out of testing and move in more into a
used tool Im getting worried that my idea of switching to mutable
entries for the same search phrase would be a bad idea as within the
last 15 days I have stored more then 3 million unique search phrases,
and a unknown number of hits.

So should I make a rolling database that stores each search then rolls
that data over to another database once a week that users can use, or
is their a better way of doing it where I only keep one entry per
search phrase and am still able to keep records how daily search
amounts so that I can track trends, etc in search phases on my sites?

My plan to have a 3-5 month window that I can look at data on search
information.

This depends a lot on what kind of statistics you want.

Think about how an odometer works. The count for a given car just
keeps going up and up and up (hopefully you use a data type big
enough so you avoid "rollover"). You get the counts in a given
period by subtracting the readings at the beginning and the end of
the period.

So, if you take snapshots of the active count table each month or
week or whatever, you can get counts for any month. You could
duplicate the whole table, or have one table with count entries
for current, Jan 1, Feb 1, Mar 1, ... .
Aug 18 '06 #3
l3vi wrote:
I ran into a guy in IRC that helped me get it worked out.

This is what we came up with

Table 1:
ID | Keyword

Tabl 2:
ID | Count | DateTime

If it is a new search we will store the search phrase into table 1, and
then add one hit to table 2. If its the same search on the same date
then we will add to the count in table 2, otherwise we add a new entry
for a new day and start the count over.

Then we will use replication (Im learning about that) to copy it to
another DB for front end usage.
explodeIm going to use BigInt(20), DateTime, and VarChar(80) on the
above
tables. Dont know if DateTime is best for tracking days or not, but it
has worked well in the past.

Sound good?

Gordon Burditt wrote:
>I want to be able to keep records of how many times a keyword was
>searched for daily, and from that I can calculate weekly and monthly.
>
>At this point I have one entry per search phrase with the number of
>hits the search phrase has gotten, and the last time it was updated.
>
>As I start to take the program out of testing and move in more into a
>used tool Im getting worried that my idea of switching to mutable
>entries for the same search phrase would be a bad idea as within the
>last 15 days I have stored more then 3 million unique search phrases,
>and a unknown number of hits.
>
>So should I make a rolling database that stores each search then rolls
>that data over to another database once a week that users can use, or
>is their a better way of doing it where I only keep one entry per
>search phrase and am still able to keep records how daily search
>amounts so that I can track trends, etc in search phases on my sites?
>
>My plan to have a 3-5 month window that I can look at data on search
>information.
This depends a lot on what kind of statistics you want.

Think about how an odometer works. The count for a given car just
keeps going up and up and up (hopefully you use a data type big
enough so you avoid "rollover"). You get the counts in a given
period by subtracting the readings at the beginning and the end of
the period.

So, if you take snapshots of the active count table each month or
week or whatever, you can get counts for any month. You could
duplicate the whole table, or have one table with count entries
for current, Jan 1, Feb 1, Mar 1, ... .
I am assuming you are using PHP
Here what i would do
Split search string using explode()
Remove noise words such as the,my,are,and,.when ,so,of,there,etc
see http://drupal.org/node/1202
You can modify word list to suit.
Store remaining word in database as suggested
You may find functions such as metaphone() and levenshtein() useful
see http://www.php.net/manual/en/ref.strings.php for PHP
ASP and other languages have similar functions

Aug 19 '06 #4
"l3vi" <ad***@boogybonbon.comwrote in message
news:11**********************@i3g2000cwc.googlegro ups.com...
I have a new system Im building that stores entries of what people are
searching for on my sites.

I want to be able to keep records of how many times a keyword was
searched for daily, and from that I can calculate weekly and monthly.

At this point I have one entry per search phrase with the number of
hits the search phrase has gotten, and the last time it was updated.

As I start to take the program out of testing and move in more into a
used tool Im getting worried that my idea of switching to mutable
entries for the same search phrase would be a bad idea as within the
last 15 days I have stored more then 3 million unique search phrases,
and a unknown number of hits.

So should I make a rolling database that stores each search then rolls
that data over to another database once a week that users can use, or
is their a better way of doing it where I only keep one entry per
search phrase and am still able to keep records how daily search
amounts so that I can track trends, etc in search phases on my sites?

My plan to have a 3-5 month window that I can look at data on search
information.
Try:

CREATE TABLE `searches` (
`search_term` char(255) NOT NULL default '',
`search_date` date default NULL,
`search_count` bigint(20) unsigned NOT NULL default '1',
PRIMARY KEY (`search_term`, `search_date`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;
INSERT is now:

$query = "INSERT INTO searches (search_term, search_date) VALUES
('$search_term', CURRENT_DATE()) ON DUPLICATE KEY UPDATE search_count =
search_count + 1";
....this will give you nothing but unique entries for any search term on any
particular date and their counts. You can then create your backups with
something like:

CREATE TABLE table_name (PRIMARY KEY(search_term, search_date)) SELECT *
FROM searches WHERE search_date = '????-??-??'


Norm
Aug 20 '06 #5


--
FREE Avatar hosting at www.easyavatar.com
"l3vi" <ad***@boogybonbon.comwrote in message
news:11**********************@i3g2000cwc.googlegro ups.com...
I have a new system Im building that stores entries of what people are
searching for on my sites.

I want to be able to keep records of how many times a keyword was
searched for daily, and from that I can calculate weekly and monthly.

At this point I have one entry per search phrase with the number of
hits the search phrase has gotten, and the last time it was updated.

As I start to take the program out of testing and move in more into a
used tool Im getting worried that my idea of switching to mutable
entries for the same search phrase would be a bad idea as within the
last 15 days I have stored more then 3 million unique search phrases,
and a unknown number of hits.

So should I make a rolling database that stores each search then rolls
that data over to another database once a week that users can use, or
is their a better way of doing it where I only keep one entry per
search phrase and am still able to keep records how daily search
amounts so that I can track trends, etc in search phases on my sites?

My plan to have a 3-5 month window that I can look at data on search
information.
update---

----- Original Message -----
From: "l3vi" <ad***@boogybonbon.com>
Newsgroups: mailing.database.mysql
Sent: Tuesday, August 15, 2006 10:06 PM
Subject: How to setup a MySQL Table to keep track of stats?

I have a new system Im building that stores entries of what people are
searching for on my sites.

I want to be able to keep records of how many times a keyword was
searched for daily, and from that I can calculate weekly and monthly.

At this point I have one entry per search phrase with the number of
hits the search phrase has gotten, and the last time it was updated.

As I start to take the program out of testing and move in more into a
used tool Im getting worried that my idea of switching to mutable
entries for the same search phrase would be a bad idea as within the
last 15 days I have stored more then 3 million unique search phrases,
and a unknown number of hits.

So should I make a rolling database that stores each search then rolls
that data over to another database once a week that users can use, or
is their a better way of doing it where I only keep one entry per
search phrase and am still able to keep records how daily search
amounts so that I can track trends, etc in search phases on my sites?

My plan to have a 3-5 month window that I can look at data on search
information.
Try:

CREATE TABLE `searches` (
`search_term` char(255) NOT NULL default '',
`search_date` date default NULL,
`search_count` bigint(20) unsigned NOT NULL default '1',
PRIMARY KEY (`search_term`, `search_date`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;
INSERT is now:

$query = "INSERT INTO searches (search_term, search_date) VALUES
('$search_term', CURRENT_DATE()) ON DUPLICATE KEY UPDATE search_count =
search_count + 1";
....this will give you nothing but unique entries for any search term on any
particular date and their counts. You can then create your backups with
something like:

$table_name = 'searches_'.date('Ymd'); // yearmonthday
$backup_date = date('Y-m-d'); // year-month-day

$query = "CREATE TABLE $table_name (PRIMARY KEY(search_term, search_date))
SELECT *
FROM searches WHERE search_date = '$backup_date'";


Norm


Aug 20 '06 #6


--
FREE Avatar hosting at www.easyavatar.com
"l3vi" <ad***@boogybonbon.comwrote in message
news:11**********************@i3g2000cwc.googlegro ups.com...
I have a new system Im building that stores entries of what people are
searching for on my sites.

I want to be able to keep records of how many times a keyword was
searched for daily, and from that I can calculate weekly and monthly.

At this point I have one entry per search phrase with the number of
hits the search phrase has gotten, and the last time it was updated.

As I start to take the program out of testing and move in more into a
used tool Im getting worried that my idea of switching to mutable
entries for the same search phrase would be a bad idea as within the
last 15 days I have stored more then 3 million unique search phrases,
and a unknown number of hits.

So should I make a rolling database that stores each search then rolls
that data over to another database once a week that users can use, or
is their a better way of doing it where I only keep one entry per
search phrase and am still able to keep records how daily search
amounts so that I can track trends, etc in search phases on my sites?

My plan to have a 3-5 month window that I can look at data on search
information.
Or even:

....this will give you nothing but unique entries for any search term on any
particular date and their counts. You can then create your backups with
something like:

$table_name = 'searches_'.date('md'); // yearmonthday
$backup_date = date('m'); // month

$query = "CREATE TABLE $table_name (PRIMARY KEY(search_term, search_date))
SELECT *
FROM searches WHERE search_date LIKE '2006-%$backup_date%'";

For a one month backup out of 2006

-or-
....this will give you nothing but unique entries for any search term on any
particular date and their counts. You can then create your backups with
something like:

$year = ???; // four digit year;
$start_month = ???; // two digit month, ex: 06
$end_month = ???; // two digit month, ex: 08
$table_name = "searches_$year_months_$start_month_through_$end_m onth";

$query = "CREATE TABLE $table_name (PRIMARY KEY(search_term, search_date))
SELECT * FROM searches WHERE search_date >= '$year-$start_month-01' AND
search_date <= '$year-$end_month-31'";

For a three month backup, you get the idea. :)
Norm
Aug 20 '06 #7
This is what I went with:

CREATE TABLE `keyword_count` (
`Num` bigint(20) NOT NULL auto_increment,
`ID` bigint(20) NOT NULL default '0',
`Count` int(20) NOT NULL default '0',
`Date` datetime NOT NULL default '0000-00-00 00:00:00',
PRIMARY KEY (`Num`),
KEY `ID` (`ID`),
KEY `Date` (`Date`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;

CREATE TABLE `keyword_lexicon` (
`ID` bigint(20) NOT NULL auto_increment,
`Keyword` varchar(80) NOT NULL default '',
PRIMARY KEY (`ID`),
KEY `Keyword` (`Keyword`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;
>From this I am able to search by any days I want, without doing backups
and I am able to remove entries that have expire over x amount of time.
Ill prob being doing a DB dump and import to another db once a day so
that the backend processes are not conflicting with users processes.

Right now Im working on what chars are blocked, and what chars are
replaced.. < That looks like it will be a ongoing event. LOL

Aug 20 '06 #8
I should note that the keyword field is set to unique, but it not
listed in the above sql.

I do have one problem that i can not figure out.

Before I insert or update anything I run:

SELECT ID FROM keyword_lexicon WHERE Keyword LIKE ? LIMIT 1

I have been getting back that it did not find a match, but when it goes
to insert I get Duplicate entry 'how to find some keywords' for key 2.

I check the DB and it is dupe, but what I dont get is why the first sql
does not catch it even with = or LIKE.

Aug 20 '06 #9
"l3vi" <ad***@boogybonbon.comwrote in message
news:11**********************@m79g2000cwm.googlegr oups.com...
I should note that the keyword field is set to unique, but it not
listed in the above sql.

I do have one problem that i can not figure out.

Before I insert or update anything I run:

SELECT ID FROM keyword_lexicon WHERE Keyword LIKE ? LIMIT 1

I have been getting back that it did not find a match, but when it goes
to insert I get Duplicate entry 'how to find some keywords' for key 2.

I check the DB and it is dupe, but what I dont get is why the first sql
does not catch it even with = or LIKE.
LIKE '%match text%'

But this is what i'm telling you about in the other post... you are doing
something manually that mysql will do for you. Load up my CREATE statement
and test some INSERTS... there are no ID's to look up...

Norm
Aug 20 '06 #10
"l3vi" <ad***@boogybonbon.comwrote in message
news:11**********************@i3g2000cwc.googlegro ups.com...
This is what I went with:

CREATE TABLE `keyword_count` (
`Num` bigint(20) NOT NULL auto_increment,
`ID` bigint(20) NOT NULL default '0',
`Count` int(20) NOT NULL default '0',
`Date` datetime NOT NULL default '0000-00-00 00:00:00',
PRIMARY KEY (`Num`),
KEY `ID` (`ID`),
KEY `Date` (`Date`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;

CREATE TABLE `keyword_lexicon` (
`ID` bigint(20) NOT NULL auto_increment,
`Keyword` varchar(80) NOT NULL default '',
PRIMARY KEY (`ID`),
KEY `Keyword` (`Keyword`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;
From this I am able to search by any days I want, without doing backups
and I am able to remove entries that have expire over x amount of time.
Ill prob being doing a DB dump and import to another db once a day so
that the backend processes are not conflicting with users processes.
I really think you ar over analyzing this... you pretty much have exactly
what I gave you spread over two tables. Do you really need the DATETIME
column? Especially since your putting an INDEX on it? It's going wreck
havock on the INDEX system. I think you would find the solution I gave will
allow you to perform your desired searches and deletes. Also, I wouldn't use
VARCHAR, just go with a CHAR(80)... it's less work for the db.

Using your 2 tables, what is your pseudo-code to enter a 'keyword' into
the db? How many calls to the db? 1 call to 'keyword_lexicon' see if
something already exists or not, 1 call to insert it into the
'keyword_lexicon' table if not, 1 call to update the 'keyword_count' table.
At least 3 calls to the db or am I missing something? With my example you
need only the 1 INSERT call to perform all the above actions (mysql does it
for you). ;)
Right now Im working on what chars are blocked, and what chars are
replaced.. < That looks like it will be a ongoing event. LOL
Whats this???

---
Norm
Aug 20 '06 #11
Sorry for taking so long to get back to you on your post, been crazy
around here. working like a mad man. hehe

That setup really rocks, got time to get a good look at it today and
give it a test run. Ill probably have some more question when I have
more time to actually work with it, but you rock man!

Aug 21 '06 #12
I use fulltext for searching with boolean so I had to change it from
char to varchar so I could do fulltext index.

I must first say your structure is great, as it lets me track by days,
weeks, etc and copy once a day, yesterdays data over to another db for
user processing.

Got any ideas on how to do a percent up/down report? I would like to
show a search term and if its gotten more traffic this the last day or
month then show growth amount. How would I form the queries to get
yesterdays hit count, and today's hit count so that php can do the
math?

A lot of the sites have many keywords beings search for and I would
like to filter out bad words. I have a regex file I put together some
time ago but I have not figured how to remove what I don't want out
of the query while still having paging functionality working properly.

Thanks!

Aug 21 '06 #13
"l3vi" <ad***@boogybonbon.comwrote in message
news:11**********************@h48g2000cwc.googlegr oups.com...
I use fulltext for searching with boolean so I had to change it from
char to varchar so I could do fulltext index.

I must first say your structure is great, as it lets me track by days,
weeks, etc and copy once a day, yesterdays data over to another db for
user processing.

Got any ideas on how to do a percent up/down report? I would like to
show a search term and if its gotten more traffic this the last day or
month then show growth amount. How would I form the queries to get
yesterdays hit count, and today's hit count so that php can do the
math?

A lot of the sites have many keywords beings search for and I would
like to filter out bad words. I have a regex file I put together some
time ago but I have not figured how to remove what I don't want out
of the query while still having paging functionality working properly.

Thanks!
Lets see:

SELECT MIN(keyword_count) AS min_count1, MAX(keyword_count) AS max_count1
WHERE keyword LIKE '%$keyword%' AND keyword_date LIKE '$year-$month-$day'
<-- alter this to your needs

....alter the date and query again for comparison. It should run pretty quick
as both search terms (keyword, keyword_date) actually form the primary keys.
Someone better than me can probably show you how to do this in one query, I
don't know...

Uh... you need to run the 'keyword' string through the 'preg_replace' prior
to your query... something like:

$badwordsarray = array('badword1','badword2',...);
$keyword = preg_replace($badwordsarray, '', $keyword); // replace with empty
string effectively removing them

.... now query...

Norm
Aug 22 '06 #14
Uh... you need to run the 'keyword' string through the 'preg_replace' prior
to your query... something like:

$badwordsarray = array('badword1','badword2',...);
$keyword = preg_replace($badwordsarray, '', $keyword); // replace with empty
string effectively removing them

... now query...

Norm
The keywords are in the db, so I need to filter them coming out, not
in.

ie. I lookup 'picture' and get a punch of porno junk, but I dont
want to see adult words, so I apply my regex file to them, but then
that off sets the count for using link and page count for viewing....
wait.. I may have salved my problem..

Guess I could just query the results, then remove them from the array,
take a count and then run the page count for the results off of that. A
lot of work, but it can be done.

Aug 22 '06 #15
nvm.. I can use AND Keyword REGEXP '' in the query or I can run it in
the background and mark words with another field in the db.

Aug 22 '06 #16

This discussion thread is closed

Replies have been disabled for this discussion.

Similar topics

19 posts views Thread by Westcoast Sheri | last post: by
4 posts views Thread by Gordon Burditt | last post: by
39 posts views Thread by Mairhtin O'Feannag | last post: by
4 posts views Thread by Patrick Rouse | last post: by
2 posts views Thread by Takeadoe | last post: by
reply views Thread by leo001 | last post: by

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.