473,769 Members | 3,923 Online
Bytes | Software Development & Data Engineering Community
+ Post

Home Posts Topics Members FAQ

Optimising MySQL queries against huge databases?

I have a database of movie titles, with about 78,000 records, and a
database of related people (directors, writers, actors/actresses etc.)
with about 141,000 records. I display a random movie out of this
database on each hit to my website's homepage.

This worked fine when I had only a couple thousand movies, but now that
the DB has grown, it seems to be taking a bit longer to process the
page.

My DB schema for each table:

CREATE TABLE `movies` (
`id` int(11) NOT NULL auto_increment,
`name` varchar(200) NOT NULL default '',
`uri` varchar(100) NOT NULL default '',
`year` year(4) NOT NULL default '0000',
`released` date NOT NULL default '0000-00-00',
`imdb` int(11) NOT NULL default '0',
`allmovies` varchar(25) NOT NULL default '',
`length` int(11) NOT NULL default '0',
`colour` enum('c','b') NOT NULL default 'c',
`sound` varchar(25) NOT NULL default '',
`director` int(11) NOT NULL default '0',
`writer` int(11) NOT NULL default '0',
`asin` varchar(25) NOT NULL default '',
`image` blob NOT NULL,
`genre` varchar(25) NOT NULL default '',
`fatso` int(11) NOT NULL default '0',
PRIMARY KEY (`id`),
UNIQUE KEY `imdb` (`imdb`),
KEY `name` (`name`),
FULLTEXT KEY `name_2` (`name`)
) TYPE=MyISAM AUTO_INCREMENT= 78483 ;

CREATE TABLE `people` (
`id` int(11) NOT NULL auto_increment,
`name` varchar(100) NOT NULL default '',
`born` date NOT NULL default '0000-00-00',
`died` date NOT NULL default '0000-00-00',
`imdb` int(11) NOT NULL default '0',
`allmusic` varchar(25) NOT NULL default '',
`allmovies` varchar(25) NOT NULL default '',
`uri` varchar(100) NOT NULL default '',
`image` blob NOT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `imdb` (`imdb`),
KEY `name` (`name`),
FULLTEXT KEY `name_2` (`name`)
) TYPE=MyISAM AUTO_INCREMENT= 141623 ;

The SQL query I am using to fetch a random movie is:

SELECT movies.id, movies.name, movies.asin, movies.fatso, people.id AS
directorid, people.name AS director
FROM movies, people
WHERE movies.image<>' ' AND people.id=movie s.director
ORDER BY RAND() LIMIT 1;

Jul 17 '05 #1
2 2100
In article <11************ *********@z14g2 000cwz.googlegr oups.com>,
"Jasper Bryant-Greene" <ja******@gmail .com> wrote:
I have a database of movie titles, with about 78,000 records, and a
database of related people (directors, writers, actors/actresses etc.)
with about 141,000 records. I display a random movie out of this
database on each hit to my website's homepage.

This worked fine when I had only a couple thousand movies, but now that
the DB has grown, it seems to be taking a bit longer to process the
page.

My DB schema for each table:

CREATE TABLE `movies` (
`id` int(11) NOT NULL auto_increment,
`name` varchar(200) NOT NULL default '',
`uri` varchar(100) NOT NULL default '',
`year` year(4) NOT NULL default '0000',
`released` date NOT NULL default '0000-00-00',
`imdb` int(11) NOT NULL default '0',
`allmovies` varchar(25) NOT NULL default '',
`length` int(11) NOT NULL default '0',
`colour` enum('c','b') NOT NULL default 'c',
`sound` varchar(25) NOT NULL default '',
`director` int(11) NOT NULL default '0',
`writer` int(11) NOT NULL default '0',
`asin` varchar(25) NOT NULL default '',
`image` blob NOT NULL,
`genre` varchar(25) NOT NULL default '',
`fatso` int(11) NOT NULL default '0',
PRIMARY KEY (`id`),
UNIQUE KEY `imdb` (`imdb`),
KEY `name` (`name`),
FULLTEXT KEY `name_2` (`name`)
) TYPE=MyISAM AUTO_INCREMENT= 78483 ;

CREATE TABLE `people` (
`id` int(11) NOT NULL auto_increment,
`name` varchar(100) NOT NULL default '',
`born` date NOT NULL default '0000-00-00',
`died` date NOT NULL default '0000-00-00',
`imdb` int(11) NOT NULL default '0',
`allmusic` varchar(25) NOT NULL default '',
`allmovies` varchar(25) NOT NULL default '',
`uri` varchar(100) NOT NULL default '',
`image` blob NOT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `imdb` (`imdb`),
KEY `name` (`name`),
FULLTEXT KEY `name_2` (`name`)
) TYPE=MyISAM AUTO_INCREMENT= 141623 ;

The SQL query I am using to fetch a random movie is:

SELECT movies.id, movies.name, movies.asin, movies.fatso, people.id AS
directorid, people.name AS director
FROM movies, people
WHERE movies.image<>' ' AND people.id=movie s.director
ORDER BY RAND() LIMIT 1;


what does the EXPLAIN <query> tell you? It should show the strategy
used by optimizer. You may have to do some rephrasing. Have you posted
this in a mysql group:

http://groups-beta.google.com/group/alt.php.sql
http://groups-beta.google.com/group/...database.mysql

--
DeeDee, don't press that button! DeeDee! NO! Dee...

Jul 17 '05 #2
Hi Jasper,

I agree with Michael about the explain, but i did not try that, so the
following are just guesses:
- there is no index on movies.image so mySQL must sequentially search
throuhg all movies that have a director (probably all or close to all 141000
- there are probably MANY movies that have an image and a director.
Unless MySQL has a special optimization for ORDER BY RAND() LIMIT ..
it may have to random-sort all of them. That will take quite some time,
i guess. So if an index on movies.image does not help, you may have to
add AND id = RAND() to the SELECT condition, with the RAND function
parameterized or multiplied and rounded or floored to get whole numbers
between the lowest and highest id in the database. And then repeat the
query until it does return a row. (if many rows have been removed you
may have to pack the id's of movies)

Success,

Henk Verhoeven,
www.phpPeanuts.org.
Jul 17 '05 #3

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

10
1837
by: Marcus | last post by:
Quick question on how I setup my mysql database(s)... If my setup is such that I have multiple clients, and each client gets 10 tables to store their data, is it better for performance if I put all of these tables into 1 large database (with a unique identifier prepended to each one) or create a separate database for each client? i.e., assuming 1000 clients, I either would have: A) 1 database and 10,000 tables in it, 10,000 total tables
2
1905
by: aum | last post by:
Hi, I'm looking for an object-relational layer, which can wrap a MySQL database into highly pythonic objects. There seem to be dozens of packages which provide some of this functionality, but all the ones I've looked at so far have major drawbacks. What I'm looking for in an object-relational wrapper layer is:
0
3527
by: Lenz Grimmer | last post by:
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Hi, MySQL 4.0.14, a new version of the popular Open Source/Free Software Database, has been released. It is now available in source and binary form for a number of platforms from our download pages at http://www.mysql.com/downloads/ and mirror sites.
2
2540
by: Brian | last post by:
Hello I am using version 4.0.12-nt of MySQL and when I hit the enter key rapidly I can't connect to the database. The result is a message is returned to me from mysql that says I can't connect because the database is down. If I only hit the 'enter' key once (to execute the page that my queries are on(their are four queries on this particular web page)) my queries execute just fine. We have noticed that this only happens when we:
1
1799
by: Jasper Bryant-Greene | last post by:
I have a database of movie titles, with about 78,000 records, and a database of related people (directors, writers, actors/actresses etc.) with about 141,000 records. I display a random movie out of this database on each hit to my website's homepage. This worked fine when I had only a couple thousand movies, but now that the DB has grown, it seems to be taking a bit longer to process the page. My DB schema for each table:
6
6784
by: Andreas Lauffer | last post by:
I changed from Access97 to AccessXP and I have immense performance problems. Details: - Access XP MDB with Jet 4.0 ( no ADP-Project ) - Linked Tables to SQL-Server 2000 over ODBC I used the SQL Profile to watch the T-SQL-Command which Access ( who creates the commands?) creates and noticed:
175
11510
by: Sai Hertz And Control Systems | last post by:
Dear all, Their was a huge rore about MySQL recently for something in java functions now theirs one more http://www.mysql.com/doc/en/News-5.0.x.html Does this concern anyone. What I think is PostgreSQL would have less USP's (Uniqe Selling Points
39
8428
by: Mairhtin O'Feannag | last post by:
Hello, I have a client (customer) who asked the question : "Why would I buy and use UDB, when MySql is free?" I had to say I was stunned. I have no experience with MySql, so I was left sort of stammering and sputtering, and managed to pull out something I heard a couple of years back - that there was no real transaction safety in MySql. In flight transactions could be lost.
6
38519
Atli
by: Atli | last post by:
This is an easy to digest 12 step guide on basics of using MySQL. It's a great refresher for those who need it and it work's great for first time MySQL users. Anyone should be able to get through this without much trouble. Programming knowledge is not required. Index What is SQL? Why MySQL? Installing MySQL. Using the MySQL command line interface
0
9589
marktang
by: marktang | last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However, people are often confused as to whether an ONU can Work As a Router. In this blog post, we’ll explore What is ONU, What Is Router, ONU & Router’s main usage, and What is the difference between ONU and Router. Let’s take a closer look ! Part I. Meaning of...
0
10216
Oralloy
by: Oralloy | last post by:
Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers, it seems that the internal comparison operator "<=>" tries to promote arguments from unsigned to signed. This is as boiled down as I can make it. Here is my compilation command: g++-12 -std=c++20 -Wnarrowing bit_field.cpp Here is the code in...
0
9865
tracyyun
by: tracyyun | last post by:
Dear forum friends, With the development of smart home technology, a variety of wireless communication protocols have appeared on the market, such as Zigbee, Z-Wave, Wi-Fi, Bluetooth, etc. Each protocol has its own unique characteristics and advantages, but as a user who is planning to build a smart home system, I am a bit confused by the choice of these technologies. I'm particularly interested in Zigbee because I've heard it does some...
1
7413
isladogs
by: isladogs | last post by:
The next Access Europe User Group meeting will be on Wednesday 1 May 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome a new presenter, Adolph Dupré who will be discussing some powerful techniques for using class modules. He will explain when you may want to use classes instead of User Defined Types (UDT). For example, to manage the data in unbound forms. Adolph will...
0
6675
by: conductexam | last post by:
I have .net C# application in which I am extracting data from word file and save it in database particularly. To store word all data as it is I am converting the whole word file firstly in HTML and then checking html paragraph one by one. At the time of converting from word file to html my equations which are in the word document file was convert into image. Globals.ThisAddIn.Application.ActiveDocument.Select();...
0
5309
by: TSSRALBI | last post by:
Hello I'm a network technician in training and I need your help. I am currently learning how to create and manage the different types of VPNs and I have a question about LAN-to-LAN VPNs. The last exercise I practiced was to create a LAN-to-LAN VPN between two Pfsense firewalls, by using IPSEC protocols. I succeeded, with both firewalls in the same network. But I'm wondering if it's possible to do the same thing, with 2 Pfsense firewalls...
0
5448
by: adsilva | last post by:
A Windows Forms form does not have the event Unload, like VB6. What one acts like?
1
3965
by: 6302768590 | last post by:
Hai team i want code for transfer the data from one system to another through IP address by using C# our system has to for every 5mins then we have to update the data what the data is updated we have to send another system
3
2815
bsmnconsultancy
by: bsmnconsultancy | last post by:
In today's digital era, a well-designed website is crucial for businesses looking to succeed. Whether you're a small business owner or a large corporation in Toronto, having a strong online presence can significantly impact your brand's success. BSMN Consultancy, a leader in Website Development in Toronto offers valuable insights into creating effective websites that not only look great but also perform exceptionally well. In this comprehensive...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.