470,648 Members | 1,455 Online
Bytes | Developer Community
New Post

Home Posts Topics Members FAQ

Post your question to a community of 470,648 developers. It's quick & easy.

need help for my honours project

Hi,
My name is alexandre, I am 4th year student at the Napier university
of edinburgh.
I am finishing my year and do a presentation of my honours project
next wednesday (the 5th of May).

I am doing a project of data mining, and I run a mysql database to
store my data.
the database structure:

#
# Table structure for table `session`
#

CREATE TABLE session (
key int(11) NOT NULL auto_increment,
ID int(11) default '0',
Quo tinytext NOT NULL,
Score int(3) default '0',
SQL mediumblob,
PRIMARY KEY (key)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;

# --------------------------------------------------------

#
# Table structure for table `clean_session`
#

CREATE TABLE clean_session (
SQL mediumblob,
Quo tinytext NOT NULL,
TOTAL bigint(21) NOT NULL default '0'
) ENGINE=MyISAM DEFAULT CHARSET=latin1;

The clean_session table is a table containing a list of SQL query
which appear at least 2 time in session and at most 5 times. The table
is generated using that SQL:

CREATE TABLE clean_$session
SELECT t1.SQL, t1.Quo, COUNT(*) AS TOTAL
FROM $session AS t1 LEFT JOIN $session AS t2 ON ((t1.SQL = t2.SQL)
AND (t1.Quo = t2.Quo))
WHERE t1.ID <> t2.ID
GROUP BY t1.SQL
HAVING (COUNT(*) >= $sql_min AND COUNT(*) <= $sql_max)
ORDER BY TOTAL ASC

I would like to retrieve the number of queries from the table session
which are as well in the table clean_session. The queries will have
two parameters: the length and the score.

Here is the SQL Is used to generate a list of the people along with
the total number of queries that they have in the first table and in
the second table as well, with a $length = 80 and a $score = 100:

SELECT t1.ID, COUNT(*) as TOTAL
FROM $session AS t1 LEFT JOIN clean_$session AS t2 ON ((t1.SQL =
t2.SQL) AND (t1.Quo = t2.Quo))
WHERE t1.Score < $score AND LENGTH(CAST(t2.SQL AS CHAR)) > $length
GROUP BY t1.ID
ORDER BY TOTAL DESC

I get a list of ID and TOTAL:
Total ID
2 35
2 340
2 361
1 367
1 386
1 402
1 64
1 103
1 107
1 134
1 174
1 250
1 302
1 346

Now, I would like to get for each ID a list of the other people which
have the same queries in their data as the one found previously. which
means that for the ID 35, I get 2 queries with a length > 80 an a
score < 100. This query occurs more than 1 time in the table session.
Therefore, I should be able to find at least 1 other people for each
query. The details for people ID=35 should include a least 2 people
having a query with a score > 100 and a length >80.
I use that SQL to get the details where $id = 35 $length = 80:

SELECT t3.ID, t1.Quo, CAST(t3.SQL AS CHAR), t3.Score
FROM $session AS t1 LEFT JOIN clean_$session as t2 ON ((t1.SQL =
t2.SQL) AND (t2.Quo = t1.Quo)) LEFT JOIN $session AS t3 ON ((t3.SQL =
t2.SQL) AND (t2.Quo = t1.Quo))
WHERE t1.ID = $id AND t3.ID IS NOT NULL AND t3.ID <> $id AND
LENGTH(CAST(t3.SQL AS CHAR)) > $length
GROUP BY t3.ID, t1.Quo
ORDER BY t3.ID

I want the good and bad queries in the details, so I do not include
the score in the WHERE clause.

I get that as a result:
ID2 Score Quo SQL
119 100 4.5
155 100 3.14
155 100 4.11
155 100 4.12
155 100 4.4
155 100 4.6
155 100 4.8
155 100 4.9
175 100 4-.1
214 100 2.16
340 26 4.3
361 5 3.14
361 0 4-.1
361 100 4.11
361 100 4.12
361 26 4.3
361 100 4.4
361 100 4.5
361 100 4.6
361 100 4.8
361 100 4.9
370 100 2.16
370 100 3.14
370 100 4-.1
370 100 4.11
370 100 4.12
370 100 4.4
370 100 4.5
370 100 4.6
370 100 4.8
370 100 4.9
389 100 4.12
389 100 4.4
389 100 4.6
389 100 4.8
389 100 4.9
394 100 2.16

I did not include the SQL detqils... this not relevant here and
confidential.

As you can see, I get 3 different queries which are bellow 100. The
there here inconsistency in my results. I should get only 2 queries
bellow 100. The problem occurs at some time, I do not know how to fix
it. The main problem will be that the SQL used for the second or the
first query would be wrong. I need the help of someone more
experienced than me.
I would be really thankfull if someone could help me sorting this out
as I have my presentation next wednesday (the 5th) and I do not want
inconsistency in my results obviously.

Any help greatly apreciated !

if you read all of this, thank you for your time.

Alexandre.
Jul 20 '05 #1
2 1515
Ok, I saw the light...

There is no need to use join for the creation of the cleaned query.
There is no need to do a group by and an order by when building the
list of the queries for the result.

I am sorry if I consumed much time to somebody.

Alexandre.
Jul 20 '05 #2
Ok, I saw the light...

There is no need to use join for the creation of the cleaned query.
There is no need to do a group by and an order by when building the
list of the queries for the result.

I am sorry if I consumed much time to somebody.

Alexandre.
Jul 20 '05 #3

This discussion thread is closed

Replies have been disabled for this discussion.

Similar topics

3 posts views Thread by Andrej Hristoliubov | last post: by
6 posts views Thread by sivashankar | last post: by
7 posts views Thread by moondaddy | last post: by
5 posts views Thread by info | last post: by
8 posts views Thread by Brett Romero | last post: by
reply views Thread by garimaarora | last post: by
reply views Thread by warner | last post: by
By using this site, you agree to our Privacy Policy and Terms of Use.