
LEFT JOIN? Or create another table?

I am often faced with the dilemma of whether to use a JOIN query across
three tables in order to grab a bunch of results - or whether to create
another table to represent what I want. The latter is less normalised,
but seems less computationally expensive to me(?).

Basically, what I have is:
Friend (uid1,uid2,fid) -> FriendshipCategories(fid,cid) ->
Categories(cid,name)

I want to find all the 'categories' (names) which a user has
previously selected.

The fully normalised method would be, presumably, to do a join across
those three tables. However alternatives include creating another table
(let's say "MyCategories") to map my UID to CID, thus allowing me to do
my lookup by simply JOINing two tables. I can also simply include UID
inside "categories" - but thus resulting in repeated entries of what
may well be the same category. I like this idea the least.
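
For reference, the fully normalised lookup described above can be sketched as follows. This is only an illustration under some assumptions: it uses Python's built-in sqlite3 module as a stand-in for MySQL, it assumes uid1 is the user who selected the categories, and the index name, the sample rows and the category_names helper are made up for the example.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Friend (uid1 INTEGER, uid2 INTEGER, fid INTEGER PRIMARY KEY);
    CREATE TABLE Categories (cid INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE FriendshipCategories (
        fid INTEGER, cid INTEGER, PRIMARY KEY (fid, cid)
    );
    -- Index on the column used to look a user up (hypothetical name).
    CREATE INDEX idx_friend_uid1 ON Friend (uid1);

    -- Made-up sample data.
    INSERT INTO Friend VALUES (1, 2, 10), (1, 3, 11), (2, 3, 12);
    INSERT INTO Categories VALUES (100, 'family'), (101, 'work'), (102, 'school');
    INSERT INTO FriendshipCategories VALUES (10, 100), (10, 101), (11, 101), (12, 102);
""")


def category_names(conn, uid):
    """Return the distinct category names used by one user's friendships."""
    rows = conn.execute(
        """
        SELECT DISTINCT c.name
        FROM Friend AS f
        JOIN FriendshipCategories AS fc ON fc.fid = f.fid
        JOIN Categories AS c ON c.cid = fc.cid
        WHERE f.uid1 = ?
        ORDER BY c.name
        """,
        (uid,),
    ).fetchall()
    return [name for (name,) in rows]


print(category_names(conn, 1))  # ['family', 'work']
```

The DISTINCT is there because one user can attach the same category to several friendships.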

The question is - how much slower is it to use the first method
(joining all three) vs. my compromise solution #2?

Any insight would be appreciated!

Thanks,

Terence

Sep 28 '05 #1


<te************@gmail.com> wrote in message
news:11**********************@g47g2000cwa.googlegroups.com...
> I am often faced with the dilemma of whether to use a JOIN query across
> three tables in order to grab a bunch of results - or whether to create
> another table to represent what I want. The latter is less normalised,
> but seems less computationally expensive to me(?).

Wrong, wrong, wrong - really wrong!

The most tiresome and worthless excuse in the world for avoiding table
normalization is the bankrupt notion that this will be "computationally
expensive". The whole notion of a relational database is designed around
table joins. Give that up and MySQL becomes a waste of time. MySQL is a
"relational database" and (surprise surprise!) it handles table
relationships with extraordinary efficiency.

> Basically, what I have is:
> Friend (uid1,uid2,fid) -> FriendshipCategories(fid,cid) ->
> Categories(cid,name)
>
> I want to find all the 'categories' (names) which a user has
> previously selected.
>
> The fully normalised method would be, presumably, to do a join across
> those three tables. However alternatives include creating another table
> (let's say "MyCategories") to map my UID to CID, thus allowing me to do
> my lookup by simply JOINing two tables.

Another, better, alternative would be to try the triple join straight up.
The one you *think* is computationally inefficient.

Another, better, alternative would be to do the query in stages using
temporary tables. Think of it as a series of table transformations instead
of something you must do in one step.

Another, better, alternative would be to use a subquery - assuming you are
working with version 4.1 or higher and have this facility available.

The second of these three (temporary tables) buys you nothing more than the
ability to break up one more complex query into 2 or more simpler queries.
IOW - it makes it easier for the human being writing the query. MySQL won't
care one way or the other. It eats problems like this for breakfast.
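
The three alternatives above can be compared side by side. The following is a rough sketch, not a definitive implementation: it runs on Python's built-in sqlite3 module rather than MySQL, assumes uid1 is the user doing the selecting, and the sample rows, the tmp_fids and tmp_cids table names and the helper functions are invented for illustration.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Friend (uid1 INTEGER, uid2 INTEGER, fid INTEGER PRIMARY KEY);
    CREATE TABLE Categories (cid INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE FriendshipCategories (
        fid INTEGER, cid INTEGER, PRIMARY KEY (fid, cid)
    );
    -- Made-up sample data.
    INSERT INTO Friend VALUES (1, 2, 10), (1, 3, 11), (2, 3, 12);
    INSERT INTO Categories VALUES (100, 'family'), (101, 'work'), (102, 'school');
    INSERT INTO FriendshipCategories VALUES (10, 100), (10, 101), (11, 101), (12, 102);
""")

# 1. The triple join, straight up.
JOIN_SQL = """
    SELECT DISTINCT c.name
    FROM Friend AS f
    JOIN FriendshipCategories AS fc ON fc.fid = f.fid
    JOIN Categories AS c ON c.cid = fc.cid
    WHERE f.uid1 = ?
    ORDER BY c.name
"""

# 3. The same lookup written with nested subqueries.
SUBQUERY_SQL = """
    SELECT name
    FROM Categories
    WHERE cid IN (
        SELECT cid FROM FriendshipCategories
        WHERE fid IN (SELECT fid FROM Friend WHERE uid1 = ?)
    )
    ORDER BY name
"""


def run(sql, uid):
    return [name for (name,) in conn.execute(sql, (uid,)).fetchall()]


def via_temp_tables(uid):
    """2. The same lookup done in stages through temporary tables."""
    conn.execute("DROP TABLE IF EXISTS tmp_fids")
    conn.execute("CREATE TEMPORARY TABLE tmp_fids (fid INTEGER)")
    conn.execute(
        "INSERT INTO tmp_fids SELECT fid FROM Friend WHERE uid1 = ?", (uid,)
    )
    conn.execute("DROP TABLE IF EXISTS tmp_cids")
    conn.execute("CREATE TEMPORARY TABLE tmp_cids (cid INTEGER)")
    conn.execute(
        """
        INSERT INTO tmp_cids
        SELECT DISTINCT fc.cid
        FROM FriendshipCategories AS fc
        JOIN tmp_fids AS t ON t.fid = fc.fid
        """
    )
    rows = conn.execute(
        """
        SELECT c.name
        FROM Categories AS c
        JOIN tmp_cids AS t ON t.cid = c.cid
        ORDER BY c.name
        """
    ).fetchall()
    return [name for (name,) in rows]


print(run(JOIN_SQL, 1))      # ['family', 'work']
print(run(SUBQUERY_SQL, 1))  # ['family', 'work']
print(via_temp_tables(1))    # ['family', 'work']
```

All three return the same rows; the staged version just spreads the work over several simpler statements.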

> I can also simply include UID
> inside "categories" - but thus resulting in repeated entries of what
> may well be the same category. I like this idea the least.

I can barely understand what you are trying to arrive at BUT it seems you
are getting into a cross product problem that you can easily avoid with a
GROUP BY clause. There are quite a few smart people hanging about on this
sig that will be glad to show you an efficient query to do exactly what you
want. You are going to have to clarify what you want the results to look
like.
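
As one possible reading of the GROUP BY suggestion, the sketch below shows the plain join returning a category once per friendship, and GROUP BY collapsing those repeats into one row per category. It is an assumption-laden illustration: Python's built-in sqlite3 module stands in for MySQL, uid1 is taken to be the selecting user, and the sample rows and the "friendships" column alias are made up.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Friend (uid1 INTEGER, uid2 INTEGER, fid INTEGER PRIMARY KEY);
    CREATE TABLE Categories (cid INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE FriendshipCategories (
        fid INTEGER, cid INTEGER, PRIMARY KEY (fid, cid)
    );
    -- Made-up sample data: user 1 uses 'work' for two friendships.
    INSERT INTO Friend VALUES (1, 2, 10), (1, 3, 11), (2, 3, 12);
    INSERT INTO Categories VALUES (100, 'family'), (101, 'work'), (102, 'school');
    INSERT INTO FriendshipCategories VALUES (10, 100), (10, 101), (11, 101), (12, 102);
""")

# Plain join: one row per (friendship, category) pair, so names repeat.
raw = conn.execute(
    """
    SELECT c.name
    FROM Friend AS f
    JOIN FriendshipCategories AS fc ON fc.fid = f.fid
    JOIN Categories AS c ON c.cid = fc.cid
    WHERE f.uid1 = ?
    ORDER BY c.name
    """,
    (1,),
).fetchall()

# GROUP BY: one row per category, with a count of friendships using it.
grouped = conn.execute(
    """
    SELECT c.name, COUNT(*) AS friendships
    FROM Friend AS f
    JOIN FriendshipCategories AS fc ON fc.fid = f.fid
    JOIN Categories AS c ON c.cid = fc.cid
    WHERE f.uid1 = ?
    GROUP BY c.cid, c.name
    ORDER BY c.name
    """,
    (1,),
).fetchall()

print(raw)      # [('family',), ('work',), ('work',)]
print(grouped)  # [('family', 1), ('work', 2)]
```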

What must your result table contain (exactly!)?
What fields are you joining on? What are the data types?

> The question is - how much slower is it to use the first method
> (joining all three) vs. my compromise solution #2.

No! That's not the question at all.
At this point, you have no idea that your triple join isn't going to be the
fastest possible query that yields the correct result. You owe it to
yourself to try that first before you even bother with other contortions.
Get this first query right and you are likely to find it a waste of time
worrying about computational efficiency. The people who wrote MySQL have
done that for you.

And they did a darn good job of it too!
Thomas Bartkus
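
One way to "try that first" is to measure both variants on your own data instead of guessing. The sketch below is a hypothetical harness, not a benchmark of MySQL: it uses Python's built-in sqlite3 module, randomly generated rows, invented table sizes, and a MyCategories table derived from the normalised tables. It prints timings rather than claiming any particular result. In MySQL itself, the equivalent of the plan inspection at the end is to prefix the query with EXPLAIN.

```python
import random
import sqlite3
import time

random.seed(0)  # reproducible made-up data
conn = sqlite3.connect(":memory:")  # SQLite standing in for MySQL (assumption)
conn.executescript("""
    CREATE TABLE Friend (uid1 INTEGER, uid2 INTEGER, fid INTEGER PRIMARY KEY);
    CREATE TABLE Categories (cid INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE FriendshipCategories (
        fid INTEGER, cid INTEGER, PRIMARY KEY (fid, cid)
    );
    CREATE TABLE MyCategories (uid INTEGER, cid INTEGER, PRIMARY KEY (uid, cid));
    CREATE INDEX idx_friend_uid1 ON Friend (uid1);
""")

# Arbitrary sizes chosen for the illustration.
N_USERS, N_CATEGORIES, FRIENDS_PER_USER = 500, 50, 10

conn.executemany(
    "INSERT INTO Categories VALUES (?, ?)",
    [(cid, "category-%d" % cid) for cid in range(N_CATEGORIES)],
)
fid = 0
for uid in range(N_USERS):
    for _ in range(FRIENDS_PER_USER):
        conn.execute(
            "INSERT INTO Friend VALUES (?, ?, ?)",
            (uid, random.randrange(N_USERS), fid),
        )
        for cid in random.sample(range(N_CATEGORIES), 2):
            conn.execute(
                "INSERT INTO FriendshipCategories VALUES (?, ?)", (fid, cid)
            )
        fid += 1

# Build the denormalised shortcut table from the normalised data.
conn.execute("""
    INSERT INTO MyCategories
    SELECT DISTINCT f.uid1, fc.cid
    FROM Friend AS f
    JOIN FriendshipCategories AS fc ON fc.fid = f.fid
""")
conn.commit()

THREE_WAY = """
    SELECT DISTINCT c.name
    FROM Friend AS f
    JOIN FriendshipCategories AS fc ON fc.fid = f.fid
    JOIN Categories AS c ON c.cid = fc.cid
    WHERE f.uid1 = ?
    ORDER BY c.name
"""
TWO_WAY = """
    SELECT c.name
    FROM MyCategories AS m
    JOIN Categories AS c ON c.cid = m.cid
    WHERE m.uid = ?
    ORDER BY c.name
"""


def timed(sql, uids):
    """Run sql once per uid; return (elapsed seconds, list of result sets)."""
    start = time.perf_counter()
    results = [conn.execute(sql, (uid,)).fetchall() for uid in uids]
    return time.perf_counter() - start, results


uids = range(N_USERS)
join_secs, join_results = timed(THREE_WAY, uids)
denorm_secs, denorm_results = timed(TWO_WAY, uids)

print("three-table join: %.4f s" % join_secs)
print("MyCategories:     %.4f s" % denorm_secs)
print("same answers:", join_results == denorm_results)

# Show how the engine plans the three-table join.
for row in conn.execute("EXPLAIN QUERY PLAN " + THREE_WAY, (0,)):
    print(row[-1])
```

Whatever the timings turn out to be on a given machine, note that MyCategories has to be kept in sync with the other tables on every change, which is the cost of the shortcut.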

> Any insight would be appreciated!
>
> Thanks,
>
> Terence

Sep 28 '05 #2

