
Best way to do so.

Sorry if I post this here, but I don't have access to any worthwhile MySQL
newsgroup for this question, and since 99% of PHP programmers deal with MySQL,
I'm sure I'll get an answer here.

Look at the following table:
CREATE TABLE searchlog (
IDSearchLog MEDIUMINT UNSIGNED NOT NULL auto_increment,
F1 MEDIUMINT UNSIGNED default NULL,
F2 VARCHAR(40) default NULL,
F3a SMALLINT UNSIGNED default NULL,
F3b SMALLINT UNSIGNED default NULL,
F4a FLOAT(11,2) UNSIGNED default NULL,
F4b FLOAT(11,2) UNSIGNED default NULL,
F5a MEDIUMINT UNSIGNED default NULL,
F5b MEDIUMINT UNSIGNED default NULL,
F6 char(8) default NULL,
F7 char(8) default NULL,
F8 char(8) default NULL,
F9a TINYINT UNSIGNED default NULL,
F9b TINYINT UNSIGNED default NULL,
F10a FLOAT(6,2) UNSIGNED default NULL,
F10b FLOAT(6,2) UNSIGNED default NULL,
F11a FLOAT(6,2) UNSIGNED default NULL,
F11b FLOAT(6,2) UNSIGNED default NULL,
F12 datetime NOT NULL,
PRIMARY KEY (IDSearchLog),
UNIQUE KEY IDSearchLog (IDSearchLog),
INDEX AI_IDSearchLog (IDSearchLog)
);

Now this will be used for managing the search log. Any time a search is done on
some fields, I must log them.
In the long run, I think there may be about 500-600 different search types.
Many searches would be quite similar.
Now, for performance reasons I have 3 possibilities:
1) I may check whether the same search has already been done, and if so log only
the datetime and the IDSearchLog. If the same search doesn't exist, then I create it.
2) I may create a new entry on every search: I won't have to check whether the
search exists, but I will have a lot of similar records. This raises the amount
of data, but isn't it quicker? I avoid a check at every search.
3) I may create many tables. If you have a close look at the field names
(explicitly changed for better understanding), any number with an a or b next to
it means that you may define a min and a max. So fields F3a and F3b are a min and
a max value. The limit combinations may be quite common between searches, so in
my searchlog table I may only keep pointers to the different combinations. This
seems very heavy in query time, but it lets me save some space. Maybe I'm missing
a quick way to create a new combination every time, and return the existing ID if
it is available instead of creating it.

After thinking about what I have, the first choice would be my preference. But
I'm afraid that the lookup for an existing record with the same fields would slow
things down a lot, and the space gained wouldn't be worth it compared to choice 2.
Maybe there is a simple way to add some fields in a query and, if the field
combination already exists, return the ID.
Some search example:
Adidas, shoes, from 150$, to 250$, blue
Nike, socks, to 50$
Adidas, socks, to 60$
Nike, socks, to 60$
Adidas shoes, to 250$
Puma shoes, red
Puma shoes, yellow
.....
I have to log these fields in a way that is as fast as possible, but also saves
space.
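For instance, with choice 2 the first two example searches above might be logged
roughly like this (I'm only guessing at the field meanings here: F1 as the brand
ID, F4a/F4b as the price range, F6 as the colour):

INSERT INTO searchlog (F1, F4a, F4b, F6, F12)
VALUES (12, 150.00, 250.00, 'blue', NOW());  -- "Adidas, shoes, from 150$, to 250$, blue"

INSERT INTO searchlog (F1, F4b, F12)
VALUES (7, 50.00, NOW());                    -- "Nike, socks, to 50$"

Every unused criterion simply stays NULL.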

Please help.

Bob
Jul 17 '05 #1
Bob Bedford wrote:
> Look at the following table:
> CREATE TABLE searchlog (
> <snip>
> PRIMARY KEY (IDSearchLog),
> UNIQUE KEY IDSearchLog (IDSearchLog),
> INDEX AI_IDSearchLog (IDSearchLog)
> );


You don't need the UNIQUE KEY clause because the manual says:

14.2.6 CREATE TABLE Syntax

A PRIMARY KEY is a unique KEY where all key columns must be defined as
NOT NULL.

<snip>
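In other words, the key definitions can be reduced to just the primary key (the
separate INDEX on the same column is redundant for the same reason), e.g.:

CREATE TABLE searchlog (
IDSearchLog MEDIUMINT UNSIGNED NOT NULL auto_increment,
-- ... all the other columns exactly as in the original post ...
F12 datetime NOT NULL,
PRIMARY KEY (IDSearchLog)
);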

As for everything else, I have no idea what you are asking.

NM

--
convert uppercase WORDS to single keystrokes to reply
Jul 17 '05 #2

"Bob Bedford" <be******@YouKn owWhatToDoHereh otmail.com> wrote
> Now this will be used for managing the search log. Any time a search is done
> on some fields, I must log them.
> In the long run, I think there may be about 500-600 different search types.
> Many searches would be quite similar.


For a log, just record the raw event. Don't try and get fancy. Doing so
may get you in trouble later when you want to mine the data in some way you
haven't even considered.

Try and be relational: use the part id or style/color/size id's from your
item master tables.

I'd also add timestamp, source, and session_id fields to each row.

Don't design to minimize physical storage. Design for maximum flexibility.
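As a rough illustration of a flat, one-row-per-search log (these column names
are only my guesses, not a prescribed schema):

CREATE TABLE searchlog (
IDSearchLog INT UNSIGNED NOT NULL auto_increment,
search_time DATETIME NOT NULL,               -- when the search was run
source VARCHAR(40) default NULL,             -- originating URL or IP address
session_id CHAR(32) default NULL,            -- visitor's session identifier
make_id MEDIUMINT UNSIGNED default NULL,     -- references the item master table
price_from FLOAT(11,2) UNSIGNED default NULL,
price_to FLOAT(11,2) UNSIGNED default NULL,
-- ... one column per remaining search criterion, left NULL when unused ...
PRIMARY KEY (IDSearchLog)
);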

How many rows will your app be inserting a day, anyway? You could purge the
old records to a history table every month if it got really nuts.

I designed a large inventory application which accumulated about a million
transactions a month, and we kept a year's worth online without any
particular problems (this was Oracle on an HP box). It's hard to imagine
you'll get anything like that volume of transactions.

Just my 2 cents off the top of my head, not fully understanding your
application...

-- Dan
Jul 17 '05 #3

First, thanks for replying.

I'll explain differently what I'm trying to do.
I have to store all the data needed to inform the companies that signed up with
our website what people's choices are.
Actually I took the example of shoes to simplify; the real items on the site are
used cars.
If 80% of the customers are looking for a Chevrolet, then the car dealer may buy
Chevrolets instead of BMWs for their customers. If they know that 90% are looking
for a car between $5000 and $8000, they will try to provide that kind of car. So,
every time somebody makes a search, I want to log what they were looking for,
then build statistics for the car dealers. Since not all fields are compulsory,
one customer may choose the price, another the color, another just the make. I
want to optimize it as much as possible while keeping it flexible enough for any
kind of statistics.
> For a log, just record the raw event. Don't try and get fancy. Doing so
> may get you in trouble later when you want to mine the data in some way
> you haven't even considered.
When you say I have to log the raw event, I don't know what you mean. My form
has many fields, and I must save only the ones that were used, not the ones left
empty. I'll use the part ID, but how do I store a price when it is entered by the
customer? I don't provide any preset values for price.
> Try and be relational: use the part id or style/color/size id's from your
> item master tables.
Obviously I'll do it this way. I'll link to other tables.

> I'd also add timestamp, source, and session_id fields to each row.
What do you mean by "source" for each row?
Why should I store the session_id? It would let me know whether the same person
looked at different articles, but since there are so many searches, it would be
hard to get an interesting result. Maybe I could say: people looking for a BMW
are also looking for an Audi.
> Don't design to minimize physical storage. Design for maximum flexibility.
My problem is saving "time". I mean if my search takes 5 seconds because of bad
DB design or something else, it's fit for the trash bin. I want maximum speed and
flexibility.
> How many rows will your app be inserting a day, anyway? You could purge the
> old records to a history table every month if it got really nuts.
I won't purge records. Maybe records 3 years old won't be very interesting, but
they may be kept.
> I designed a large inventory application which accumulated about a million
> transactions a month, and we kept a year's worth online without any
> particular problems (this was Oracle on an HP box). It's hard to imagine
> you'll get anything like that volume of transactions.

Our competitor gets 7 million/month. I won't reach that for a long time, but
from what I see, I have about 50,000 searches/month, and I don't store them yet.

I have 2 choices:
1) store a table with every different search and another where I store the
searchID, the timestamp and the session_ID. But I have to check whether the
record already exists at every new search.
2) store everything in the same table, but a lot of identical records will exist
in the table.

In example 1, I may have this kind of result.
table a:
(ID, Make, pricefrom, priceto, kmfrom, kmto, ...)
1, BMW, , 8000, , , ...
2, Bugatti, 50000, 80000, , 100000, ...
3, BMW, , , , 150000, ...

table b:
(aID, timestamp, session_id)
1, 10:10:42 2004-10-10, dfasdfsdfsdf
2, 10:10:48 2004-10-10, fdsadsfdsdsv
3, 10:10:55 2004-10-10, fsdvgrtwtrwt
2, 10:11:02 2004-10-10, ffdsfvvvddvd
2, 10:11:20 2004-10-10, ffshhzjuzturz
At the beginning, table a would fill up very quickly, but later I'll have less
new data going there.

Flexibility wouldn't be affected by either choice, only speed and disk space.
So what's the best choice?
Is there a way to store a record in a table and, if it already exists, retrieve
its ID? That would be the fastest way, I think.
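Maybe something like this would work for table a — assuming a UNIQUE index over
the search columns and MySQL 4.1 or later (the table and column names are just
the ones from my example above):

ALTER TABLE searcha
ADD UNIQUE KEY uq_search (Make, pricefrom, priceto, kmfrom, kmto);

INSERT INTO searcha (Make, pricefrom, priceto, kmfrom, kmto)
VALUES ('BMW', NULL, 8000, NULL, NULL)
ON DUPLICATE KEY UPDATE ID = LAST_INSERT_ID(ID);
-- LAST_INSERT_ID() now holds the new or the already-existing row's ID
INSERT INTO searchb (aID, `timestamp`, session_id)
VALUES (LAST_INSERT_ID(), NOW(), 'dfasdfsdfsdf');

One catch: NULLs are never equal in a UNIQUE key, so the unused criteria would
need a sentinel value (0, for example) for the duplicate detection to work.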
Jul 17 '05 #4
Bob Bedford wrote:
> I'll explain differently what I'm trying to do.
> I have to store all the data needed to inform the companies that signed up
> with our website what people's choices are.

<snip>

Just another thought... Don't log anything with the DB. Just use the access
log for data mining. You could probably use a cron job once a week that does
the data mining and gives you enough stats, or trims/groups the data available
in the access log.

--
<?php echo 'Just another PHP saint'; ?>
Email: rrjanbiah-at-Y!com Blog: http://rajeshanbiah.blogspot.com/

Jul 17 '05 #5

"Bob Bedford" <be******@YouKn owWhatToDoHereh otmail.com> wrote

> First, thanks for replying.
Too much time on my hands...
> I'll explain differently what I'm trying to do.
<snip>
> So, every time somebody makes a search, I want to log what they were looking
> for, then build statistics for the car dealers. Since not all fields are
> compulsory, one customer may choose the price, another the color, another just
> the make. I want to optimize it as much as possible while keeping it flexible
> enough for any kind of statistics.
It's a design mistake to try and optimize storage. It limits your choices
down the line. When I was younger, I did this several times, and it was
always a mistake in the long run.

Log the entire choice row. Then you can cook it down and digest it if you
need to for further analysis. 50,000 rows a month isn't much of a problem
on a fast box with fast disks. Keeping *all* fields in the log will give
you maximum flexibility and ease in conducting ad-hoc querying etc.
> > For a log, just record the raw event.
> When you say I have to log the raw event, I don't know what you mean. My
> form has many fields, and I must save only the ones that were used, not the
> ones left empty.
Is someone looking over your shoulder and saying "you cannot save a field in a
row if it may be blank x% of the time"? If you want to design a hierarchical
schema with "search" at the top and many other types of zero-to-one entities
beneath it (e.g. mfg, color, miles, cost, upholstery, etc.), have at it. This
will save storage but slow down querying (because you'll need lots of joins),
and it makes every query a multiple-table join (a pain, if you do a lot of
them). De-normalization (e.g. one row per search event) is used all the time
where data mining needs to be quick and convenient.
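To make the difference concrete (the table and column names below are invented
purely for illustration), counting searches by make means a join in the
hierarchical layout but a single-table query in the flat one:

-- hierarchical layout: one child table per optional criterion
SELECT m.make_name, COUNT(*)
FROM search s
JOIN search_make sm ON sm.search_id = s.id
JOIN make m ON m.id = sm.make_id
GROUP BY m.make_name;

-- flat layout: one row per search event
SELECT make, COUNT(*)
FROM searchlog
GROUP BY make;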
> I'll use the part ID, but how do I store a price when it is entered by the
> customer? I don't provide any preset values for price.
Leave them null, or default to zero, your pick. Nulls are sometimes a bit of a
pain to deal with syntactically, as function arguments, etc.
> > I'd also add timestamp, source, and session_id fields to each row.
> What do you mean by "source" for each row?
Originating URL or IP address, just in case someone is messing with you...
> Why should I store the session_id? It would let me know whether the same
> person looked at different articles, but since there are so many searches, it
> would be hard to get an interesting result.
If you're in the design mentality, you don't prejudge what may or may not be
interesting.
> Maybe I could say: people looking for a BMW are also looking for an Audi.
Shopping behavior is very interesting to marketing types. You may get clues for
suggestive selling from analyzing search habits.
> > Don't design to minimize physical storage. Design for maximum flexibility.
> My problem is saving "time". I mean if my search takes 5 seconds because of
> bad DB design or something else, it's fit for the trash bin. I want maximum
> speed and flexibility.


Waiting a few seconds for a research query is perfectly ok. In my
experience (designing fairly large systems for 25 yrs) sub-second response
times are important for high volume things like customer activity, order
entry, and the like. Research (which is what your search history table is
for) involves people thinking about what they want, looking at the results,
and thinking some more. Educate your customer that processing historical
data takes a few seconds.
> Our competitor gets 7 million/month. I won't reach that for a long time, but
> from what I see, I have about 50,000 searches/month, and I don't store them
> yet.
A few hundred thousand rows in a table or database is not much of a challenge
for a modern database. If you're worried, create a table with a couple hundred k
rows and run some queries against it.
> 1) store a table with every different search and another where I store the
> searchID, the timestamp and the session_ID. But I have to check whether the
> record already exists at every new search.
> 2) store everything in the same table, but a lot of identical records will
> exist in the table.
IMO, you're over-thinking this one. Just store a row per search and be done
with it.

> Flexibility wouldn't be affected by either choice, only speed and disk space.
> So what's the best choice?
Simplicity. One row per search event.
> Is there a way to store a record in a table and, if it already exists,
> retrieve its ID? That would be the fastest way, I think.
Every insertion will be slower that way. If you want user entry speed maximized,
this isn't the way to go.

Just the thoughts of an old designer....

-- Dan
Jul 17 '05 #6
> Just another thought... Don't log anything with the DB. Just use the access
> log for data mining. You could probably use a cron job once a week that does
> the data mining and gives you enough stats, or trims/groups the data available
> in the access log.


Since I use session variables across pages, and also POST data, the access log
won't work, as that data doesn't appear in the access log.

I'm thinking of logging the data in a table, then using a cron job to "pack" it
in a way that takes less space and better fits how the data may be stored. Speed
is kept and space is also saved, letting me keep data for a long time without
worrying about wasted space.

I don't know if it's the best way, but it's a compromise that fits my needs
quite well.
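The kind of weekly "packing" job I have in mind would be roughly this (the
summary table, and my guess that F1 holds the make and F12 the datetime, are
just a sketch; DATE() needs MySQL 4.1+):

CREATE TABLE search_stats (
stat_date DATE NOT NULL,
make_id MEDIUMINT UNSIGNED default NULL,
searches INT UNSIGNED NOT NULL
);

INSERT INTO search_stats (stat_date, make_id, searches)
SELECT DATE(F12), F1, COUNT(*)
FROM searchlog
WHERE F12 < CURDATE()
GROUP BY DATE(F12), F1;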
Bob
Jul 17 '05 #7
> Just the thoughts of an old designer....

...with probably far more experience than me. I'll take your advice, and I can
always change it later. This way will also be faster to implement.

Thanks for help.

Bob
Jul 17 '05 #8
Bob Bedford wrote:
> > Just another thought... Don't log anything with the DB. Just use the access
> > log for data mining. You could probably use a cron job once a week that does
> > the data mining and gives you enough stats, or trims/groups the data
> > available in the access log.
> Since I use session variables across pages, and also POST data, the access log
> won't work, as that data doesn't appear in the access log.

<snip>

Since it is a search page, as an end user I would expect it to be a GET. Can't
you just change it to a GET? Also, you *can* actually get the session or other
PHP variables into the access log.

--
<?php echo 'Just another PHP saint'; ?>
Email: rrjanbiah-at-Y!com Blog: http://rajeshanbiah.blogspot.com/

Jul 17 '05 #9

"R. Rajesh Jeba Anbiah" <ng**********@r ediffmail.com> a écrit dans le
message de news: 11************* *********@z14g2 00...legr oups.com...
> <snip>
> Since it is a search page, as an end user I would expect it to be a GET.
> Can't you just change it to a GET? Also, you *can* actually get the session
> or other PHP variables into the access log.


I'm actually switching to GET. What a mistake to use POST for a search! I didn't
know. It doesn't matter; I'm rewriting part of my site and improving it anyway.

The session ID is available through the SID constant. I don't know its size, so
how do I store it in a table? What type of field should I use for it?
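If I'm right that PHP's default session IDs are 32 hexadecimal characters (an
MD5 hash), a fixed-width column with a little headroom should do; this is only a
guess at the sizing, though:

ALTER TABLE searchlog ADD session_id VARCHAR(40) default NULL;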

Bob
Jul 17 '05 #10

