performance of query for normalized DB

phlype.johnson

I'm struggling to find the best query from performance point of view
and readability for a normalized DB design.
To illustrate better my question on whether normalized designs lead to
more complex queries yes or no, I have prepared an example. The example
is a database with the following tables:
*table person with fields:
-persid: autoincrement id
-name: name of the person
*table material with fields:
-materialid: autoincrement id
-material: name of the material eg "wood"
*table color with fields:
-colorid: autoincrement id
-color: name of the color eg "green"
*table persmaterial with fields:
-persmatid: autoincrement id
-persid: link to table person
-materialid: link to table material
*table perscolor with fields:
-perscolorid: autoincrement id
-persid: link to table person
-colorid: link to table color
Using these tables it is straightforward to store the preference of a
certain person for colors and materials.
Now use the statements at the end of the post to create the tables and
populate them with intial values. Next we want to find all persons who
like the colors red or blue and also like the color green. Furthermore
the person should have a preference for iron as material. I understand
there are several ways to obtain the requested result.
Option 1:
The initial query can be written in pseudocode as
Find all persons that (like as color (red OR blue) AND green) AND that
(like as material iron). I do not see right away how to write this as a
query with joins so I rewrite the pseudo query as follows:
Find all persons that ((like as color red AND green) AND that (like as
material iron)) OR ((like as color blue AND green) AND that (like as
material iron))
this leads to the following query
(SELECT p.persid FROM person p, perscolor pc, persmaterial pm WHERE
p.persid=pc.persid AND (pc.colorid=1 OR pc.colorid=2) AND
p.persid=pm.persid AND pm.materialid=2 GROUP BY p.persid HAVING
(count(DISTINCT pc.colorid)=2 AND count(DISTINCT pm.materialid)=1))
UNION
(SELECT p.persid FROM person p, perscolor pc, persmaterial pm WHERE
p.persid=pc.persid AND (pc.colorid=2 OR pc.colorid=3) AND
p.persid=pm.persid AND pm.materialid=2 GROUP BY p.persid HAVING
(count(DISTINCT pc.colorid)=2 AND count(DISTINCT pm.materialid)=1))
You can say that for someone familiar to sql this is not overly
complicated but the problem I see is the following. I want to use this
database in a webapplication where a user will be able to find persons
with certain preferences. The user will be allowed to play around with
AND and ORs in his request. To state it differently the user will have
the possibility to type in a pseudo query like the example I used
above. As you have seen above, to get to the final sql query I need to
do a transformation of the pseudo query to the sql query. Secondly, if
we have many criteria you can see easily that the number of unions can
grow exponentially if the user starts playing around with ANDs and ORs
for a certain property.

Option 2:
The pseudo query
Find all persons that (like as color (red OR blue) AND green) AND that
(like as material iron)
can straightforwardly be coded in a query using subqueries:
SELECT persid FROM person p WHERE
(EXISTS(SELECT * FROM perscolor pc WHERE pc.colorid=1 AND
p.persid=pc.persid)
OR
EXISTS(SELECT * FROM perscolor pc WHERE pc.colorid=3 AND
p.persid=pc.persid))
AND
EXISTS(SELECT * FROM perscolor pc WHERE pc.colorid=2 AND
p.persid=pc.persid)
AND
EXISTS(SELECT * FROM persmaterial pm WHERE pm.materialid=2 AND
p.persid=pm.persid)

Both options are quite different. I am no expert but this is my
understanding of the pros and contras of the 2 options:
*option 1:
-difficult to automatically generate the query from the pseudo query
*option 2:
+easy to generate query from pseudo query
-are subqueries not inefficient? If for each subquery you have to go
over the parameters outside the subquery, it means pretty much that for
each subquery you go over the entire person table; I might be wrong and
subqueries might be more optimized but I have no idea on this

Overall, I would like to know
*if there are other options to translate the above pseudo query into an
sql query?
*what of the otpions (proposed + new ones) are best from a performance
point of view?

-- phpMyAdmin SQL Dump
-- version 2.6.1
-- http://www.phpmyadmin.net
--
-- Host: localhost
-- Generation Time: Oct 19, 2006 at 01:13 PM
-- Server version: 4.1.9
-- PHP Version: 4.3.10
--
-- Database: `aston`
--

-- --------------------------------------------------------

--
-- Table structure for table `color`
--

CREATE TABLE `color` (
`colorid` int(11) NOT NULL auto_increment,
`color` varchar(30) NOT NULL default '',
PRIMARY KEY (`colorid`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 AUTO_INCREMENT=5 ;

--
-- Dumping data for table `color`
--

INSERT INTO `color` VALUES (1, 'red');
INSERT INTO `color` VALUES (2, 'green');
INSERT INTO `color` VALUES (3, 'blue');
INSERT INTO `color` VALUES (4, 'yellow');

-- --------------------------------------------------------

--
-- Table structure for table `material`
--

CREATE TABLE `material` (
`materialid` int(11) NOT NULL auto_increment,
`material` varchar(30) NOT NULL default '',
PRIMARY KEY (`materialid`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 AUTO_INCREMENT=3 ;

--
-- Dumping data for table `material`
--

INSERT INTO `material` VALUES (1, 'wood');
INSERT INTO `material` VALUES (2, 'iron');

-- --------------------------------------------------------

--
-- Table structure for table `perscolor`
--

CREATE TABLE `perscolor` (
`perscolorid` int(11) NOT NULL auto_increment,
`persid` int(11) NOT NULL default '0',
`colorid` int(11) NOT NULL default '0',
PRIMARY KEY (`perscolorid`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 AUTO_INCREMENT=7 ;

--
-- Dumping data for table `perscolor`
--

INSERT INTO `perscolor` VALUES (1, 1, 1);
INSERT INTO `perscolor` VALUES (2, 1, 2);
INSERT INTO `perscolor` VALUES (3, 2, 1);
INSERT INTO `perscolor` VALUES (5, 3, 3);
INSERT INTO `perscolor` VALUES (6, 3, 2);

-- --------------------------------------------------------

--
-- Table structure for table `persmaterial`
--

CREATE TABLE `persmaterial` (
`persmatid` int(11) NOT NULL auto_increment,
`persid` int(11) NOT NULL default '0',
`materialid` int(11) NOT NULL default '0',
PRIMARY KEY (`persmatid`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 AUTO_INCREMENT=6 ;

--
-- Dumping data for table `persmaterial`
--

INSERT INTO `persmaterial` VALUES (1, 1, 1);
INSERT INTO `persmaterial` VALUES (2, 1, 2);
INSERT INTO `persmaterial` VALUES (3, 2, 1);
INSERT INTO `persmaterial` VALUES (5, 3, 2);

-- --------------------------------------------------------

--
-- Table structure for table `person`
--

CREATE TABLE `person` (
`persid` int(11) NOT NULL auto_increment,
`name` varchar(30) NOT NULL default '',
PRIMARY KEY (`persid`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 AUTO_INCREMENT=4 ;

--
-- Dumping data for table `person`
--

INSERT INTO `person` VALUES (1, 'john');
INSERT INTO `person` VALUES (2, 'emily');
INSERT INTO `person` VALUES (3, 'liz');

Oct 19 '06 #1

Subscribe Reply

3282

Similar topics

14577

Help with DECODE function and crosstab query

by: Mark Hoffman | last post by:

I'm a newbie at Oracle..Be gentle! I have a table that stores information (WMI data) about computers on our network. The table looks like: ComputerID ItemID Class Property Value

Oracle Database

2962

slow query, sql code attached

by: DBNovice | last post by:

I have a database that keeps records on the issue and failure of an item. Currently, the database is poorly desisned; therefore I'm performing queries to break the data into normalized tables and...

MySQL Database

2021

Update query in access - based on data in different table

by: sheree | last post by:

I have 3 tables (amoung a few others) in a small access database. The tables are as follows: == AEReport -------- AEID (PK) RptCatelog GCRCID PatientID EvntDate

Microsoft Access / VBA

15341

Changing query to access different tables

by: Marizel | last post by:

I'm not sure there's an easy solution to this, but thought I'd ask. I often find myself with a query which I'd like to reuse, but with a different datasource. These datasources generally have...

Microsoft Access / VBA

1417

Is it normal for a query on a 90k record table to take 1+ seconds?

by: MattPF | last post by:

I have a table that is -- 30 Megabytes 90,000 rows ~65 columns My query goes SELECT city FROM table WHERE zip = 90210; It will then find about 10 matching records.

MySQL Database

1407

Multiple Query

by: Eric | last post by:

I am trying to get the technician contractor(TECHCONT) of last field technician (FS_TechID or LstVldTech). It could be in Tbl_ValidDispute or Tbl_PPVResearch. In table Tbl_PPVResearch first i...

Microsoft Access / VBA

1945

More efficient way to query fields

by: teser3 | last post by:

I would like to know if it is possible to do a pattern match or better way to condense my below query in Access 2003. Basically I need to query the average with all fields in TableOne that end...

Microsoft Access / VBA

3541

Sub-Query / Cross Join / or something else?

by: Henrik Juul | last post by:

I have the following 2 tables: (BATCHES) BatchID KEY ID OrderID Action1DateTime Action2DateTime Action3DateTime Action4DateTime

Microsoft SQL Server

1301

Need to query database using form on website

by: jslssuze | last post by:

I have 5 tables in access 1 - depth - normalized 2 - duration - normalized 3 - location - normalized 4 - type - normalized 5 - all data - denormalized with lookups to other tables and full...

ASP / Active Server Pages

7229

What is ONU?

by: marktang | last post by:

ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However,...

General

7129

Changing the language in Windows 10

by: Hystou | last post by:

Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can...

Windows Server

7061

The easy way to turn off automatic updates for Windows 10/11

by: Hystou | last post by:

Overview: Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows...

Windows Server

5637

AI Job Threat for Devs

by: agi2029 | last post by:

Let's talk about the concept of autonomous AI software engineers and no-code agents. These AIs are designed to manage the entire lifecycle of a software development project—planning, coding, testing,...

Career Advice

5057

Access Europe - Using VBA to create a class based on a table - Wed 1 May

by: isladogs | last post by:

The next Access Europe User Group meeting will be on Wednesday 1 May 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome a new...

Microsoft Access / VBA

3208

Trying to create a lan-to-lan vpn between two differents networks

by: TSSRALBI | last post by:

Hello I'm a network technician in training and I need your help. I am currently learning how to create and manage the different types of VPNs and I have a question about LAN-to-LAN VPNs. The...

Networking - Hardware / Configuration

1566

transfer the data from one system to another through ip address

by: 6302768590 | last post by:

Hai team i want code for transfer the data from one system to another through IP address by using C# our system has to for every 5mins then we have to update the data what the data is updated ...

C# / C Sharp

769

How to add payments to a PHP MySQL app.

by: muto222 | last post by:

How can i add a mobile payment intergratation into php mysql website.

PHP

428

Comprehensive Guide to Website Development in Toronto: Expert Insights from BSMN Consultancy

by: bsmnconsultancy | last post by:

In today's digital era, a well-designed website is crucial for businesses looking to succeed. Whether you're a small business owner or a large corporation in Toronto, having a strong online presence...

General