LEFT JOIN? Or create another table?

I am often faced with the dilemma of whether to use a JOIN query across
three tables in order to grab a bunch of results - or whether to create
another table to represent what I want. The latter is less normalised,
but seems less computationally expensive to me(?).

Basically what I have is:
Friend (uid1,uid2,fid) -> FriendshipCategories(fid,cid) ->
Categories(cid,name)

I want to find out all the 'categories' (names) which a user has
previously selected.

The fully normalised method would be, presumably, to do a join across
those three tables. However alternatives include creating another table
(let's say "MyCategories") to map my UID to CID, thus allowing me to do
my lookup by simply JOINing two tables. I could also simply include the UID
in "Categories", but that would result in repeated entries of what
may well be the same category. I like this idea the least.
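
A minimal sketch of the trade-off between the three-table join and the "MyCategories" shortcut. It rests on several assumptions: SQLite (in memory, via Python's sqlite3 module) stands in for MySQL, the column types and sample rows are made up, and uid1 is taken to be the user who made the selection.

```python
import sqlite3

# SQLite (in memory) stands in for MySQL; table names follow the post,
# while column types and sample rows are assumptions for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Friend (uid1 INTEGER, uid2 INTEGER, fid INTEGER PRIMARY KEY);
CREATE TABLE FriendshipCategories (fid INTEGER, cid INTEGER, PRIMARY KEY (fid, cid));
CREATE TABLE Categories (cid INTEGER PRIMARY KEY, name TEXT);
INSERT INTO Friend VALUES (1, 2, 10), (1, 3, 11), (2, 3, 12);
INSERT INTO FriendshipCategories VALUES (10, 100), (11, 100), (11, 101), (12, 102);
INSERT INTO Categories VALUES (100, 'school'), (101, 'work'), (102, 'family');

-- The shortcut table from the post, filled once from the normalised data.
CREATE TABLE MyCategories (uid INTEGER, cid INTEGER, PRIMARY KEY (uid, cid));
INSERT INTO MyCategories
SELECT DISTINCT f.uid1, fc.cid
FROM Friend f JOIN FriendshipCategories fc ON fc.fid = f.fid;
""")

JOIN3 = """
SELECT DISTINCT c.name FROM Friend f
JOIN FriendshipCategories fc ON fc.fid = f.fid
JOIN Categories c ON c.cid = fc.cid
WHERE f.uid1 = ? ORDER BY c.name
"""
JOIN2 = """
SELECT c.name FROM MyCategories m
JOIN Categories c ON c.cid = m.cid
WHERE m.uid = ? ORDER BY c.name
"""

def names(sql, uid):
    """Run one of the two queries and return the category names as a list."""
    return [name for (name,) in conn.execute(sql, (uid,))]

same_at_first = names(JOIN3, 1) == names(JOIN2, 1)

# A new selection is recorded in the normalised tables only.
conn.execute("INSERT INTO FriendshipCategories VALUES (10, 102)")
via_join = names(JOIN3, 1)      # sees the new category
via_shortcut = names(JOIN2, 1)  # stale until MyCategories is updated as well
print(same_at_first, via_join, via_shortcut)
```

Both queries agree until the data changes. After that, the shortcut table returns an out-of-date answer unless every write also maintains it, which is the real cost of the less normalised design.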

The question is - how much slower is it to use the first method
(joining all three) vs. my compromise solution #2?

Any insight would be appreciated!

Thanks,

Terence

Sep 28 '05 #1
<te************@gmail.com> wrote in message
news:11**********************@g47g2000cwa.googlegroups.com...
> I am often faced with the dilemma of whether to use a JOIN query across
> three tables in order to grab a bunch of results - or whether to create
> another table to represent what I want. The latter is less normalised,
> but seems less computationally expensive to me(?).

Wrong, wrong, wrong - really wrong!

The most tiresome and worthless excuse in the world for avoiding table
normalization is the bankrupt notion that this will be "computationally
expensive". The whole notion of a relational database is designed around
table joins. Give that up and MySQL becomes a waste of time. MySQL is a
"relational database" and (surprise surprise!) it handles table
relationships with extraordinary efficiency.

> Basically what I have is:
> Friend (uid1,uid2,fid) -> FriendshipCategories(fid,cid) ->
> Categories(cid,name)
>
> If I want to find out all the 'categories' (names) which a user has
> previously selected.
>
> The fully normalised method would be, presumably, to do a join across
> those three tables. However alternatives include creating another table
> (let's say "MyCategories") to map my UID to CID, thus allowing me to do
> my lookup by simply JOINing two tables.

Another, better, alternative would be to try the triple join straight up.
The one you *think* is computationally inefficient.
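
For illustration, the triple join might look like the sketch below. It is not taken from the thread: SQLite (in memory, via Python's sqlite3 module) stands in for MySQL, the column types and sample rows are invented, category_names is a hypothetical helper, and uid1 is assumed to be the user who made the selection.

```python
import sqlite3

# SQLite (in memory) stands in for MySQL; the SELECT itself is plain SQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Friend (uid1 INTEGER, uid2 INTEGER, fid INTEGER PRIMARY KEY);
CREATE TABLE FriendshipCategories (fid INTEGER, cid INTEGER, PRIMARY KEY (fid, cid));
CREATE TABLE Categories (cid INTEGER PRIMARY KEY, name TEXT);
-- Made-up sample data.
INSERT INTO Friend VALUES (1, 2, 10), (1, 3, 11), (2, 3, 12);
INSERT INTO FriendshipCategories VALUES (10, 100), (11, 100), (11, 101), (12, 102);
INSERT INTO Categories VALUES (100, 'school'), (101, 'work'), (102, 'family');
""")

def category_names(conn, uid):
    """Names of all categories attached to friendships where uid1 = uid."""
    rows = conn.execute(
        """
        SELECT DISTINCT c.name
        FROM Friend AS f
        JOIN FriendshipCategories AS fc ON fc.fid = f.fid
        JOIN Categories AS c ON c.cid = fc.cid
        WHERE f.uid1 = ?
        ORDER BY c.name
        """,
        (uid,),
    ).fetchall()
    return [name for (name,) in rows]

print(category_names(conn, 1))  # ['school', 'work']
```

DISTINCT is there because the same category can be attached to several of one user's friendships. The keys on fid and cid give the engine something to look rows up by at each step of the join.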

Another, better, alternative would be to do the query in stages using
temporary tables. Think of it as a series of table transformations instead
of something you must do in one step.
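
A rough sketch of the staged approach, under the same kind of assumptions as any example here: SQLite (in memory, via Python's sqlite3 module) stands in for MySQL, the sample rows are invented, tmp_fids and tmp_cids are hypothetical table names, and uid1 is assumed to be the user who made the selection. MySQL's temporary-table syntax may differ slightly.

```python
import sqlite3

# SQLite (in memory) stands in for MySQL; sample data is made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Friend (uid1 INTEGER, uid2 INTEGER, fid INTEGER PRIMARY KEY);
CREATE TABLE FriendshipCategories (fid INTEGER, cid INTEGER, PRIMARY KEY (fid, cid));
CREATE TABLE Categories (cid INTEGER PRIMARY KEY, name TEXT);
INSERT INTO Friend VALUES (1, 2, 10), (1, 3, 11), (2, 3, 12);
INSERT INTO FriendshipCategories VALUES (10, 100), (11, 100), (11, 101), (12, 102);
INSERT INTO Categories VALUES (100, 'school'), (101, 'work'), (102, 'family');
""")

def category_names_staged(conn, uid):
    """Same result as the three-table join, built as a series of transformations."""
    cur = conn.cursor()
    # Stage 1: the friendships that belong to this user.
    cur.execute("CREATE TEMPORARY TABLE tmp_fids (fid INTEGER)")
    cur.execute("INSERT INTO tmp_fids SELECT fid FROM Friend WHERE uid1 = ?", (uid,))
    # Stage 2: the categories attached to those friendships, de-duplicated.
    cur.execute("CREATE TEMPORARY TABLE tmp_cids (cid INTEGER)")
    cur.execute(
        "INSERT INTO tmp_cids "
        "SELECT DISTINCT fc.cid FROM FriendshipCategories fc "
        "JOIN tmp_fids t ON t.fid = fc.fid"
    )
    # Stage 3: translate category ids into names.
    rows = cur.execute(
        "SELECT c.name FROM Categories c "
        "JOIN tmp_cids t ON t.cid = c.cid ORDER BY c.name"
    ).fetchall()
    # Clean up so the function can be called again on the same connection.
    cur.execute("DROP TABLE tmp_fids")
    cur.execute("DROP TABLE tmp_cids")
    return [name for (name,) in rows]

print(category_names_staged(conn, 1))  # ['school', 'work']
```

Each stage can be inspected on its own with a plain SELECT, which is the main benefit: it helps the person debugging the query, not the database.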

Another, better, alternative would be to use a subquery - assuming you are
working with version 4.1 or higher and have this facility available.
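
One possible subquery form, sketched with nested IN clauses. As before, this is an illustration and not the poster's code: SQLite (in memory, via Python's sqlite3 module) stands in for MySQL, the sample rows are invented, category_names_subquery is a hypothetical helper, and uid1 is assumed to be the user who made the selection.

```python
import sqlite3

# SQLite (in memory) stands in for MySQL; sample data is made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Friend (uid1 INTEGER, uid2 INTEGER, fid INTEGER PRIMARY KEY);
CREATE TABLE FriendshipCategories (fid INTEGER, cid INTEGER, PRIMARY KEY (fid, cid));
CREATE TABLE Categories (cid INTEGER PRIMARY KEY, name TEXT);
INSERT INTO Friend VALUES (1, 2, 10), (1, 3, 11), (2, 3, 12);
INSERT INTO FriendshipCategories VALUES (10, 100), (11, 100), (11, 101), (12, 102);
INSERT INTO Categories VALUES (100, 'school'), (101, 'work'), (102, 'family');
""")

def category_names_subquery(conn, uid):
    """Category names for one user, written with nested subqueries."""
    rows = conn.execute(
        """
        SELECT name
        FROM Categories
        WHERE cid IN (
            SELECT cid
            FROM FriendshipCategories
            WHERE fid IN (SELECT fid FROM Friend WHERE uid1 = ?)
        )
        ORDER BY name
        """,
        (uid,),
    ).fetchall()
    return [name for (name,) in rows]

print(category_names_subquery(conn, 1))  # ['school', 'work']
```

Because the outer query reads each Categories row at most once, no DISTINCT is needed here even when a category is attached to several friendships.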

The second (2 of 3) buys you nothing more than the ability to break up one
more complex query into 2 or more simpler queries. IOW - it makes it easier
for the human being writing the query. MySQL won't care one way or the
other. It eats problems like this for breakfast.

> I can also simply include UID
> inside "categories" - but thus resulting in repeated entries of what
> may well be the same category. I like this idea the least.

I can barely understand what you are trying to arrive at BUT it seems you
are getting into a cross product problem that you can easily avoid with a
GROUP BY clause. There are quite a few smart people hanging about on this
sig that will be glad to show you an efficient query to do exactly what you
want. You are going to have to clarify what you want the results to look
like.
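
If the duplication in question is the same category name coming back once per friendship, a GROUP BY collapses it. The sketch below is a guess at that situation, not the poster's query: SQLite (in memory, via Python's sqlite3 module) stands in for MySQL, the sample rows are invented, and uid1 is assumed to be the user who made the selection.

```python
import sqlite3

# SQLite (in memory) stands in for MySQL; sample data is made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Friend (uid1 INTEGER, uid2 INTEGER, fid INTEGER PRIMARY KEY);
CREATE TABLE FriendshipCategories (fid INTEGER, cid INTEGER, PRIMARY KEY (fid, cid));
CREATE TABLE Categories (cid INTEGER PRIMARY KEY, name TEXT);
INSERT INTO Friend VALUES (1, 2, 10), (1, 3, 11), (2, 3, 12);
INSERT INTO FriendshipCategories VALUES (10, 100), (11, 100), (11, 101), (12, 102);
INSERT INTO Categories VALUES (100, 'school'), (101, 'work'), (102, 'family');
""")

JOINS = """
FROM Friend f
JOIN FriendshipCategories fc ON fc.fid = f.fid
JOIN Categories c ON c.cid = fc.cid
WHERE f.uid1 = ?
"""

# Without grouping: one row per (friendship, category) pair, so names repeat.
plain = conn.execute(
    "SELECT c.name " + JOINS + " ORDER BY c.name", (1,)
).fetchall()

# With grouping: one row per category, plus how many friendships use it.
grouped = conn.execute(
    "SELECT c.name, COUNT(*) " + JOINS + " GROUP BY c.cid, c.name ORDER BY c.name",
    (1,),
).fetchall()

print(plain)    # [('school',), ('school',), ('work',)]
print(grouped)  # [('school', 2), ('work', 1)]
```

When the count is not needed, SELECT DISTINCT c.name gives the same de-duplicated list of names.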

What must your result table contain (exactly!)?
What fields are you joining on? What are the data types?

> The question is - how much slower is it to use the first method
> (joining all three) vs. my compromise solution #2?

No! That's not the question at all.
At this point, you have no idea that your triple join isn't going to be the
fastest possible query that yields the correct result. You owe it to
yourself to try that first before you even bother with other contortions.
Get this first query right and you are likely to find it a waste of time
worrying about computational efficiency. The people who wrote MySQL have
done that for you.

And they did a darn good job of it too!
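
One way to "try that first" is to ask the engine how it plans to run the join and to time it on realistic data. The sketch below uses SQLite's EXPLAIN QUERY PLAN from Python as a stand-in; in MySQL the equivalent is to prefix the SELECT with EXPLAIN. The sample rows, the index name idx_friend_uid1 and the choice of uid1 as the selecting user are all assumptions.

```python
import sqlite3
import time

# SQLite (in memory) stands in for MySQL; sample data is made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Friend (uid1 INTEGER, uid2 INTEGER, fid INTEGER PRIMARY KEY);
CREATE TABLE FriendshipCategories (fid INTEGER, cid INTEGER, PRIMARY KEY (fid, cid));
CREATE TABLE Categories (cid INTEGER PRIMARY KEY, name TEXT);
-- Hypothetical index so the lookup by user does not scan the whole table.
CREATE INDEX idx_friend_uid1 ON Friend (uid1);
INSERT INTO Friend VALUES (1, 2, 10), (1, 3, 11), (2, 3, 12);
INSERT INTO FriendshipCategories VALUES (10, 100), (11, 100), (11, 101), (12, 102);
INSERT INTO Categories VALUES (100, 'school'), (101, 'work'), (102, 'family');
""")

QUERY = """
SELECT DISTINCT c.name
FROM Friend f
JOIN FriendshipCategories fc ON fc.fid = f.fid
JOIN Categories c ON c.cid = fc.cid
WHERE f.uid1 = ?
ORDER BY c.name
"""

# The fourth column of each plan row is a human-readable description of the step.
plan = [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + QUERY, (1,))]

# Time the real query; with three sample rows this only shows the method.
start = time.perf_counter()
rows = conn.execute(QUERY, (1,)).fetchall()
elapsed = time.perf_counter() - start

for step in plan:
    print(step)
print(f"{len(rows)} rows in {elapsed:.6f} s")
```

The plan text and the timing depend on the engine, its version and the data volume, so the numbers only mean something when measured against a copy of the real tables.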
Thomas Bartkus

> Any insight would be appreciated!
>
> Thanks,
>
> Terence

Sep 28 '05 #2
