473,721 Members | 2,230 Online
Bytes | Software Development & Data Engineering Community
+ Post

Home Posts Topics Members FAQ

performance of query for normalized DB

I'm struggling to find the best query from performance point of view
and readability for a normalized DB design.
To illustrate better my question on whether normalized designs lead to
more complex queries yes or no, I have prepared an example. The example
is a database with the following tables:
*table person with fields:
-persid: autoincrement id
-name: name of the person
*table material with fields:
-materialid: autoincrement id
-material: name of the material eg "wood"
*table color with fields:
-colorid: autoincrement id
-color: name of the color eg "green"
*table persmaterial with fields:
-persmatid: autoincrement id
-persid: link to table person
-materialid: link to table material
*table perscolor with fields:
-perscolorid: autoincrement id
-persid: link to table person
-colorid: link to table color
Using these tables it is straightforward to store the preference of a
certain person for colors and materials.
Now use the statements at the end of the post to create the tables and
populate them with intial values. Next we want to find all persons who
like the colors red or blue and also like the color green. Furthermore
the person should have a preference for iron as material. I understand
there are several ways to obtain the requested result.
Option 1:
The initial query can be written in pseudocode as
Find all persons that (like as color (red OR blue) AND green) AND that
(like as material iron). I do not see right away how to write this as a
query with joins so I rewrite the pseudo query as follows:
Find all persons that ((like as color red AND green) AND that (like as
material iron)) OR ((like as color blue AND green) AND that (like as
material iron))
this leads to the following query
(SELECT p.persid FROM person p, perscolor pc, persmaterial pm WHERE
p.persid=pc.per sid AND (pc.colorid=1 OR pc.colorid=2) AND
p.persid=pm.per sid AND pm.materialid=2 GROUP BY p.persid HAVING
(count(DISTINCT pc.colorid)=2 AND count(DISTINCT pm.materialid)= 1))
UNION
(SELECT p.persid FROM person p, perscolor pc, persmaterial pm WHERE
p.persid=pc.per sid AND (pc.colorid=2 OR pc.colorid=3) AND
p.persid=pm.per sid AND pm.materialid=2 GROUP BY p.persid HAVING
(count(DISTINCT pc.colorid)=2 AND count(DISTINCT pm.materialid)= 1))
You can say that for someone familiar to sql this is not overly
complicated but the problem I see is the following. I want to use this
database in a webapplication where a user will be able to find persons
with certain preferences. The user will be allowed to play around with
AND and ORs in his request. To state it differently the user will have
the possibility to type in a pseudo query like the example I used
above. As you have seen above, to get to the final sql query I need to
do a transformation of the pseudo query to the sql query. Secondly, if
we have many criteria you can see easily that the number of unions can
grow exponentially if the user starts playing around with ANDs and ORs
for a certain property.

Option 2:
The pseudo query
Find all persons that (like as color (red OR blue) AND green) AND that
(like as material iron)
can straightforward ly be coded in a query using subqueries:
SELECT persid FROM person p WHERE
(EXISTS(SELECT * FROM perscolor pc WHERE pc.colorid=1 AND
p.persid=pc.per sid)
OR
EXISTS(SELECT * FROM perscolor pc WHERE pc.colorid=3 AND
p.persid=pc.per sid))
AND
EXISTS(SELECT * FROM perscolor pc WHERE pc.colorid=2 AND
p.persid=pc.per sid)
AND
EXISTS(SELECT * FROM persmaterial pm WHERE pm.materialid=2 AND
p.persid=pm.per sid)

Both options are quite different. I am no expert but this is my
understanding of the pros and contras of the 2 options:
*option 1:
-difficult to automatically generate the query from the pseudo query
*option 2:
+easy to generate query from pseudo query
-are subqueries not inefficient? If for each subquery you have to go
over the parameters outside the subquery, it means pretty much that for
each subquery you go over the entire person table; I might be wrong and
subqueries might be more optimized but I have no idea on this

Overall, I would like to know
*if there are other options to translate the above pseudo query into an
sql query?
*what of the otpions (proposed + new ones) are best from a performance
point of view?

-- phpMyAdmin SQL Dump
-- version 2.6.1
-- http://www.phpmyadmin.net
--
-- Host: localhost
-- Generation Time: Oct 19, 2006 at 01:13 PM
-- Server version: 4.1.9
-- PHP Version: 4.3.10
--
-- Database: `aston`
--

-- --------------------------------------------------------

--
-- Table structure for table `color`
--

CREATE TABLE `color` (
`colorid` int(11) NOT NULL auto_increment,
`color` varchar(30) NOT NULL default '',
PRIMARY KEY (`colorid`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 AUTO_INCREMENT= 5 ;

--
-- Dumping data for table `color`
--

INSERT INTO `color` VALUES (1, 'red');
INSERT INTO `color` VALUES (2, 'green');
INSERT INTO `color` VALUES (3, 'blue');
INSERT INTO `color` VALUES (4, 'yellow');

-- --------------------------------------------------------

--
-- Table structure for table `material`
--

CREATE TABLE `material` (
`materialid` int(11) NOT NULL auto_increment,
`material` varchar(30) NOT NULL default '',
PRIMARY KEY (`materialid`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 AUTO_INCREMENT= 3 ;

--
-- Dumping data for table `material`
--

INSERT INTO `material` VALUES (1, 'wood');
INSERT INTO `material` VALUES (2, 'iron');

-- --------------------------------------------------------

--
-- Table structure for table `perscolor`
--

CREATE TABLE `perscolor` (
`perscolorid` int(11) NOT NULL auto_increment,
`persid` int(11) NOT NULL default '0',
`colorid` int(11) NOT NULL default '0',
PRIMARY KEY (`perscolorid`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 AUTO_INCREMENT= 7 ;

--
-- Dumping data for table `perscolor`
--

INSERT INTO `perscolor` VALUES (1, 1, 1);
INSERT INTO `perscolor` VALUES (2, 1, 2);
INSERT INTO `perscolor` VALUES (3, 2, 1);
INSERT INTO `perscolor` VALUES (5, 3, 3);
INSERT INTO `perscolor` VALUES (6, 3, 2);

-- --------------------------------------------------------

--
-- Table structure for table `persmaterial`
--

CREATE TABLE `persmaterial` (
`persmatid` int(11) NOT NULL auto_increment,
`persid` int(11) NOT NULL default '0',
`materialid` int(11) NOT NULL default '0',
PRIMARY KEY (`persmatid`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 AUTO_INCREMENT= 6 ;

--
-- Dumping data for table `persmaterial`
--

INSERT INTO `persmaterial` VALUES (1, 1, 1);
INSERT INTO `persmaterial` VALUES (2, 1, 2);
INSERT INTO `persmaterial` VALUES (3, 2, 1);
INSERT INTO `persmaterial` VALUES (5, 3, 2);

-- --------------------------------------------------------

--
-- Table structure for table `person`
--

CREATE TABLE `person` (
`persid` int(11) NOT NULL auto_increment,
`name` varchar(30) NOT NULL default '',
PRIMARY KEY (`persid`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 AUTO_INCREMENT= 4 ;

--
-- Dumping data for table `person`
--

INSERT INTO `person` VALUES (1, 'john');
INSERT INTO `person` VALUES (2, 'emily');
INSERT INTO `person` VALUES (3, 'liz');

Oct 19 '06 #1
0 3298

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

2
14608
by: Mark Hoffman | last post by:
I'm a newbie at Oracle..Be gentle! I have a table that stores information (WMI data) about computers on our network. The table looks like: ComputerID ItemID Class Property Value
4
2975
by: DBNovice | last post by:
I have a database that keeps records on the issue and failure of an item. Currently, the database is poorly desisned; therefore I'm performing queries to break the data into normalized tables and attempting to perform a "left join" query to build a cross-reference table. The left join query is currently taking nearly 2 hours for MySQL to process, using Navicat as a front-end. My system specs are 1.4Mhz Pentium Processor with 1GB of RAM...
4
2038
by: sheree | last post by:
I have 3 tables (amoung a few others) in a small access database. The tables are as follows: == AEReport -------- AEID (PK) RptCatelog GCRCID PatientID EvntDate
10
15377
by: Marizel | last post by:
I'm not sure there's an easy solution to this, but thought I'd ask. I often find myself with a query which I'd like to reuse, but with a different datasource. These datasources generally have identical field names. The queries select a subset of the fields, so "Select *" is not really an option. Is there an easy way to change the source of a query, either in the design grid or SQL display? I suppose I could copy the SQL into WordPad...
5
1425
by: MattPF | last post by:
I have a table that is -- 30 Megabytes 90,000 rows ~65 columns My query goes SELECT city FROM table WHERE zip = 90210; It will then find about 10 matching records.
4
1412
by: Eric | last post by:
I am trying to get the technician contractor(TECHCONT) of last field technician (FS_TechID or LstVldTech). It could be in Tbl_ValidDispute or Tbl_PPVResearch. In table Tbl_PPVResearch first i have to match the first 5 digits of Account number with Tech_ID.CORP and once it match i have to find last field technician from all three fields(FS_TechID1,FS_TechID2,FS_TechID3). First five digits of account number in both tables are the CORP in...
2
1957
by: teser3 | last post by:
I would like to know if it is possible to do a pattern match or better way to condense my below query in Access 2003. Basically I need to query the average with all fields in TableOne that end with _myWord: Here is what I am using: select ((avg(abcd_myWord) + avg(other_myWord) +
3
3550
by: Henrik Juul | last post by:
I have the following 2 tables: (BATCHES) BatchID KEY ID OrderID Action1DateTime Action2DateTime Action3DateTime Action4DateTime
1
1310
by: jslssuze | last post by:
I have 5 tables in access 1 - depth - normalized 2 - duration - normalized 3 - location - normalized 4 - type - normalized 5 - all data - denormalized with lookups to other tables and full pricing for each scenario I need to figure out how to write the form so that instead of putting data into the table, it is in effect querying the existing table and then some help figuring out how to write the if ... then query
0
9376
Oralloy
by: Oralloy | last post by:
Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers, it seems that the internal comparison operator "<=>" tries to promote arguments from unsigned to signed. This is as boiled down as I can make it. Here is my compilation command: g++-12 -std=c++20 -Wnarrowing bit_field.cpp Here is the code in...
0
9230
jinu1996
by: jinu1996 | last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven tapestry of website design and digital marketing. It's not merely about having a website; it's about crafting an immersive digital experience that captivates audiences and drives business growth. The Art of Business Website Design Your website is...
1
9148
by: Hystou | last post by:
Overview: Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows Update option using the Control Panel or Settings app; it automatically checks for updates and installs any it finds, whether you like it or not. For most users, this new feature is actually very convenient. If you want to control the update process,...
1
6678
isladogs
by: isladogs | last post by:
The next Access Europe User Group meeting will be on Wednesday 1 May 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome a new presenter, Adolph Dupré who will be discussing some powerful techniques for using class modules. He will explain when you may want to use classes instead of User Defined Types (UDT). For example, to manage the data in unbound forms. Adolph will...
0
5994
by: conductexam | last post by:
I have .net C# application in which I am extracting data from word file and save it in database particularly. To store word all data as it is I am converting the whole word file firstly in HTML and then checking html paragraph one by one. At the time of converting from word file to html my equations which are in the word document file was convert into image. Globals.ThisAddIn.Application.ActiveDocument.Select();...
0
4499
by: TSSRALBI | last post by:
Hello I'm a network technician in training and I need your help. I am currently learning how to create and manage the different types of VPNs and I have a question about LAN-to-LAN VPNs. The last exercise I practiced was to create a LAN-to-LAN VPN between two Pfsense firewalls, by using IPSEC protocols. I succeeded, with both firewalls in the same network. But I'm wondering if it's possible to do the same thing, with 2 Pfsense firewalls...
0
4762
by: adsilva | last post by:
A Windows Forms form does not have the event Unload, like VB6. What one acts like?
1
3207
by: 6302768590 | last post by:
Hai team i want code for transfer the data from one system to another through IP address by using C# our system has to for every 5mins then we have to update the data what the data is updated we have to send another system
2
2596
muto222
by: muto222 | last post by:
How can i add a mobile payment intergratation into php mysql website.

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.