473,508 Members | 2,282 Online
Bytes | Software Development & Data Engineering Community
+ Post

Home Posts Topics Members FAQ

performance of query for normalized DB

I'm struggling to find the best query from performance point of view
and readability for a normalized DB design.
To illustrate better my question on whether normalized designs lead to
more complex queries yes or no, I have prepared an example. The example
is a database with the following tables:
*table person with fields:
-persid: autoincrement id
-name: name of the person
*table material with fields:
-materialid: autoincrement id
-material: name of the material eg "wood"
*table color with fields:
-colorid: autoincrement id
-color: name of the color eg "green"
*table persmaterial with fields:
-persmatid: autoincrement id
-persid: link to table person
-materialid: link to table material
*table perscolor with fields:
-perscolorid: autoincrement id
-persid: link to table person
-colorid: link to table color
Using these tables it is straightforward to store the preference of a
certain person for colors and materials.
Now use the statements at the end of the post to create the tables and
populate them with intial values. Next we want to find all persons who
like the colors red or blue and also like the color green. Furthermore
the person should have a preference for iron as material. I understand
there are several ways to obtain the requested result.
Option 1:
The initial query can be written in pseudocode as
Find all persons that (like as color (red OR blue) AND green) AND that
(like as material iron). I do not see right away how to write this as a
query with joins so I rewrite the pseudo query as follows:
Find all persons that ((like as color red AND green) AND that (like as
material iron)) OR ((like as color blue AND green) AND that (like as
material iron))
this leads to the following query
(SELECT p.persid FROM person p, perscolor pc, persmaterial pm WHERE
p.persid=pc.persid AND (pc.colorid=1 OR pc.colorid=2) AND
p.persid=pm.persid AND pm.materialid=2 GROUP BY p.persid HAVING
(count(DISTINCT pc.colorid)=2 AND count(DISTINCT pm.materialid)=1))
UNION
(SELECT p.persid FROM person p, perscolor pc, persmaterial pm WHERE
p.persid=pc.persid AND (pc.colorid=2 OR pc.colorid=3) AND
p.persid=pm.persid AND pm.materialid=2 GROUP BY p.persid HAVING
(count(DISTINCT pc.colorid)=2 AND count(DISTINCT pm.materialid)=1))
You can say that for someone familiar to sql this is not overly
complicated but the problem I see is the following. I want to use this
database in a webapplication where a user will be able to find persons
with certain preferences. The user will be allowed to play around with
AND and ORs in his request. To state it differently the user will have
the possibility to type in a pseudo query like the example I used
above. As you have seen above, to get to the final sql query I need to
do a transformation of the pseudo query to the sql query. Secondly, if
we have many criteria you can see easily that the number of unions can
grow exponentially if the user starts playing around with ANDs and ORs
for a certain property.

Option 2:
The pseudo query
Find all persons that (like as color (red OR blue) AND green) AND that
(like as material iron)
can straightforwardly be coded in a query using subqueries:
SELECT persid FROM person p WHERE
(EXISTS(SELECT * FROM perscolor pc WHERE pc.colorid=1 AND
p.persid=pc.persid)
OR
EXISTS(SELECT * FROM perscolor pc WHERE pc.colorid=3 AND
p.persid=pc.persid))
AND
EXISTS(SELECT * FROM perscolor pc WHERE pc.colorid=2 AND
p.persid=pc.persid)
AND
EXISTS(SELECT * FROM persmaterial pm WHERE pm.materialid=2 AND
p.persid=pm.persid)

Both options are quite different. I am no expert but this is my
understanding of the pros and contras of the 2 options:
*option 1:
-difficult to automatically generate the query from the pseudo query
*option 2:
+easy to generate query from pseudo query
-are subqueries not inefficient? If for each subquery you have to go
over the parameters outside the subquery, it means pretty much that for
each subquery you go over the entire person table; I might be wrong and
subqueries might be more optimized but I have no idea on this

Overall, I would like to know
*if there are other options to translate the above pseudo query into an
sql query?
*what of the otpions (proposed + new ones) are best from a performance
point of view?

-- phpMyAdmin SQL Dump
-- version 2.6.1
-- http://www.phpmyadmin.net
--
-- Host: localhost
-- Generation Time: Oct 19, 2006 at 01:13 PM
-- Server version: 4.1.9
-- PHP Version: 4.3.10
--
-- Database: `aston`
--

-- --------------------------------------------------------

--
-- Table structure for table `color`
--

CREATE TABLE `color` (
`colorid` int(11) NOT NULL auto_increment,
`color` varchar(30) NOT NULL default '',
PRIMARY KEY (`colorid`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 AUTO_INCREMENT=5 ;

--
-- Dumping data for table `color`
--

INSERT INTO `color` VALUES (1, 'red');
INSERT INTO `color` VALUES (2, 'green');
INSERT INTO `color` VALUES (3, 'blue');
INSERT INTO `color` VALUES (4, 'yellow');

-- --------------------------------------------------------

--
-- Table structure for table `material`
--

CREATE TABLE `material` (
`materialid` int(11) NOT NULL auto_increment,
`material` varchar(30) NOT NULL default '',
PRIMARY KEY (`materialid`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 AUTO_INCREMENT=3 ;

--
-- Dumping data for table `material`
--

INSERT INTO `material` VALUES (1, 'wood');
INSERT INTO `material` VALUES (2, 'iron');

-- --------------------------------------------------------

--
-- Table structure for table `perscolor`
--

CREATE TABLE `perscolor` (
`perscolorid` int(11) NOT NULL auto_increment,
`persid` int(11) NOT NULL default '0',
`colorid` int(11) NOT NULL default '0',
PRIMARY KEY (`perscolorid`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 AUTO_INCREMENT=7 ;

--
-- Dumping data for table `perscolor`
--

INSERT INTO `perscolor` VALUES (1, 1, 1);
INSERT INTO `perscolor` VALUES (2, 1, 2);
INSERT INTO `perscolor` VALUES (3, 2, 1);
INSERT INTO `perscolor` VALUES (5, 3, 3);
INSERT INTO `perscolor` VALUES (6, 3, 2);

-- --------------------------------------------------------

--
-- Table structure for table `persmaterial`
--

CREATE TABLE `persmaterial` (
`persmatid` int(11) NOT NULL auto_increment,
`persid` int(11) NOT NULL default '0',
`materialid` int(11) NOT NULL default '0',
PRIMARY KEY (`persmatid`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 AUTO_INCREMENT=6 ;

--
-- Dumping data for table `persmaterial`
--

INSERT INTO `persmaterial` VALUES (1, 1, 1);
INSERT INTO `persmaterial` VALUES (2, 1, 2);
INSERT INTO `persmaterial` VALUES (3, 2, 1);
INSERT INTO `persmaterial` VALUES (5, 3, 2);

-- --------------------------------------------------------

--
-- Table structure for table `person`
--

CREATE TABLE `person` (
`persid` int(11) NOT NULL auto_increment,
`name` varchar(30) NOT NULL default '',
PRIMARY KEY (`persid`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 AUTO_INCREMENT=4 ;

--
-- Dumping data for table `person`
--

INSERT INTO `person` VALUES (1, 'john');
INSERT INTO `person` VALUES (2, 'emily');
INSERT INTO `person` VALUES (3, 'liz');

Oct 19 '06 #1
0 3282

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

2
14577
by: Mark Hoffman | last post by:
I'm a newbie at Oracle..Be gentle! I have a table that stores information (WMI data) about computers on our network. The table looks like: ComputerID ItemID Class Property Value
4
2962
by: DBNovice | last post by:
I have a database that keeps records on the issue and failure of an item. Currently, the database is poorly desisned; therefore I'm performing queries to break the data into normalized tables and...
4
2021
by: sheree | last post by:
I have 3 tables (amoung a few others) in a small access database. The tables are as follows: == AEReport -------- AEID (PK) RptCatelog GCRCID PatientID EvntDate
10
15341
by: Marizel | last post by:
I'm not sure there's an easy solution to this, but thought I'd ask. I often find myself with a query which I'd like to reuse, but with a different datasource. These datasources generally have...
5
1417
by: MattPF | last post by:
I have a table that is -- 30 Megabytes 90,000 rows ~65 columns My query goes SELECT city FROM table WHERE zip = 90210; It will then find about 10 matching records.
4
1407
by: Eric | last post by:
I am trying to get the technician contractor(TECHCONT) of last field technician (FS_TechID or LstVldTech). It could be in Tbl_ValidDispute or Tbl_PPVResearch. In table Tbl_PPVResearch first i...
2
1945
by: teser3 | last post by:
I would like to know if it is possible to do a pattern match or better way to condense my below query in Access 2003. Basically I need to query the average with all fields in TableOne that end...
3
3541
by: Henrik Juul | last post by:
I have the following 2 tables: (BATCHES) BatchID KEY ID OrderID Action1DateTime Action2DateTime Action3DateTime Action4DateTime
1
1301
by: jslssuze | last post by:
I have 5 tables in access 1 - depth - normalized 2 - duration - normalized 3 - location - normalized 4 - type - normalized 5 - all data - denormalized with lookups to other tables and full...
0
7229
marktang
by: marktang | last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However,...
0
7129
by: Hystou | last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can...
1
7061
by: Hystou | last post by:
Overview: Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows...
0
5637
agi2029
by: agi2029 | last post by:
Let's talk about the concept of autonomous AI software engineers and no-code agents. These AIs are designed to manage the entire lifecycle of a software development project—planning, coding, testing,...
1
5057
isladogs
by: isladogs | last post by:
The next Access Europe User Group meeting will be on Wednesday 1 May 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome a new...
0
3208
by: TSSRALBI | last post by:
Hello I'm a network technician in training and I need your help. I am currently learning how to create and manage the different types of VPNs and I have a question about LAN-to-LAN VPNs. The...
0
1566
by: 6302768590 | last post by:
Hai team i want code for transfer the data from one system to another through IP address by using C# our system has to for every 5mins then we have to update the data what the data is updated ...
1
769
muto222
by: muto222 | last post by:
How can i add a mobile payment intergratation into php mysql website.
0
428
bsmnconsultancy
by: bsmnconsultancy | last post by:
In today's digital era, a well-designed website is crucial for businesses looking to succeed. Whether you're a small business owner or a large corporation in Toronto, having a strong online presence...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.