473,385 Members | 1,355 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,385 software developers and data experts.

SELECT DISTINCT with JOIN

Hi everyone

Have a problem I would areally appreciate help with.
I have 3 tables in a standard format for a Bookshop, eg

Products
Categories
Categories_Products

the latter allowing me to have products in multiple categories.

Everthing works well except for one annoying little thing.

When an individual product (which is in more than one topcategory) is added
to the Shopping Cart it displays twice, because in my select statement I
have the Category listed. I realise I could remove the TopCategory from the
statement and that makes my DISTINCT work as I wanted, but Id prefer to have
the TopCategory as it saves me later having to another SQL query (Im already
doing one to allow me not to list category in the Statement .... but If I
can overcome this one ... then I can remove this as well).

Here is my table structure (the necessary bits)
products
idProduct int
....

categories
idcategory int
idParentCategory int
topcategory int
...

categories_products
idCatProd int
idProduct int
idCategory

When I run a query such as

SELECT DISTINCT a.idProduct, a.description,a.descriptionLong,
a.listPrice,a.price,a.smallImageUrl,a.stock, a.fileName,a.noShipCharge,
c.topcategory
FROM products a, categories_products b, categories c
WHERE active = -1 AND homePage = -1
AND a.idProduct = b.idProduct
AND c.idcategory=b.idcategory
AND prodType = 1 ORDER BY a.idProduct DESC

This will return all products as expected, as well as any products which are
in more than one TopCategory.

Any ideas how to overcome this would be greatly appreciated.

Cheers

Craig
Jul 20 '05 #1
14 45432
On Fri, 14 May 2004 09:05:08 +1200, Craig Hoskin wrote:
Hi everyone

Have a problem I would areally appreciate help with.
I have 3 tables in a standard format for a Bookshop, eg

Products
Categories
Categories_Products

the latter allowing me to have products in multiple categories.

Everthing works well except for one annoying little thing.

When an individual product (which is in more than one topcategory) is added
to the Shopping Cart it displays twice, because in my select statement I
have the Category listed. I realise I could remove the TopCategory from the
statement and that makes my DISTINCT work as I wanted, but Id prefer to have
the TopCategory as it saves me later having to another SQL query (Im already
doing one to allow me not to list category in the Statement .... but If I
can overcome this one ... then I can remove this as well).

Here is my table structure (the necessary bits)
products
idProduct int
....

categories
idcategory int
idParentCategory int
topcategory int
...

categories_products
idCatProd int
idProduct int
idCategory

When I run a query such as

SELECT DISTINCT a.idProduct, a.description,a.descriptionLong,
a.listPrice,a.price,a.smallImageUrl,a.stock, a.fileName,a.noShipCharge,
c.topcategory
FROM products a, categories_products b, categories c
WHERE active = -1 AND homePage = -1
AND a.idProduct = b.idProduct
AND c.idcategory=b.idcategory
AND prodType = 1 ORDER BY a.idProduct DESC

This will return all products as expected, as well as any products which are
in more than one TopCategory.

Any ideas how to overcome this would be greatly appreciated.

Cheers

Craig


Hi Craig,

The first idea will have to come from you. SQL Server won't make your
choices for you. So you must decide *which* TopCategory to display for
products that are in more than one TopCategory.

If you can describe how to choose, I (or someone else in this NG) will
probably be able to help writing up the query.

Best, Hugo
--

(Remove _NO_ and _SPAM_ to get my e-mail address)
Jul 20 '05 #2
Doooh

Always the way, you post a question and find a possible solution :-)

Would this work?

select a.idProduct, a.description,a.descriptionLong,
a.listPrice,a.price,a.smallImageUrl,a.stock, a.fileName,a.noShipCharge
from Products a
where exists (select * from categories c, categories_products b where
a.idProduct = b.idProduct AND c.idcategory=b.idcategory)
AND a.active = -1 AND a.homePage = -1 AND a.prodType = 1
ORDER BY a.idProduct DESC

Cheers

Craig
Hi everyone

Have a problem I would areally appreciate help with.
I have 3 tables in a standard format for a Bookshop, eg

Products
Categories
Categories_Products

the latter allowing me to have products in multiple categories.

Everthing works well except for one annoying little thing.

When an individual product (which is in more than one topcategory) is added to the Shopping Cart it displays twice, because in my select statement I
have the Category listed. I realise I could remove the TopCategory from the statement and that makes my DISTINCT work as I wanted, but Id prefer to have the TopCategory as it saves me later having to another SQL query (Im already doing one to allow me not to list category in the Statement .... but If I
can overcome this one ... then I can remove this as well).

Here is my table structure (the necessary bits)
products
idProduct int
....

categories
idcategory int
idParentCategory int
topcategory int
...

categories_products
idCatProd int
idProduct int
idCategory

When I run a query such as

SELECT DISTINCT a.idProduct, a.description,a.descriptionLong,
a.listPrice,a.price,a.smallImageUrl,a.stock, a.fileName,a.noShipCharge,
c.topcategory
FROM products a, categories_products b, categories c
WHERE active = -1 AND homePage = -1
AND a.idProduct = b.idProduct
AND c.idcategory=b.idcategory
AND prodType = 1 ORDER BY a.idProduct DESC

This will return all products as expected, as well as any products which are in more than one TopCategory.

Any ideas how to overcome this would be greatly appreciated.

Cheers

Craig

Jul 20 '05 #3

"Hugo Kornelis" <hugo@pe_NO_rFact.in_SPAM_fo> wrote in message
news:7u********************************@4ax.com...
Hi Hugo

The first idea will have to come from you. SQL Server won't make your
choices for you. So you must decide *which* TopCategory to display for
products that are in more than one TopCategory.
Sorry, yes I should have calrified that aspect. Answer: I dont care :-)
Any topcategory will do as its still the same product, and the produut
display page will actually list all the categories its in.

Thanks for taking the time to help me.

Cheers

Craig
If you can describe how to choose, I (or someone else in this NG) will
probably be able to help writing up the query.

Best, Hugo
--

(Remove _NO_ and _SPAM_ to get my e-mail address)

Jul 20 '05 #4
No cancel that ... I forgot my c.topcategory in the qeruy.

Back to seeking help again :-)
Jul 20 '05 #5
On Fri, 14 May 2004 09:30:00 +1200, Craig Hoskin wrote:

"Hugo Kornelis" <hugo@pe_NO_rFact.in_SPAM_fo> wrote in message
news:7u********************************@4ax.com.. .
Hi Hugo

The first idea will have to come from you. SQL Server won't make your
choices for you. So you must decide *which* TopCategory to display for
products that are in more than one TopCategory.


Sorry, yes I should have calrified that aspect. Answer: I dont care :-)
Any topcategory will do as its still the same product, and the produut
display page will actually list all the categories its in.

Thanks for taking the time to help me.

Cheers

Craig
If you can describe how to choose, I (or someone else in this NG) will
probably be able to help writing up the query.


Hi Craig,

Hmmmm. Since SQL Server still won't choose for you, allow me to do it for
you. The following two queries will select the "first" topcategory (I
don't know the datatype, so it will either be the lowest numbered, the one
that comes first in the alphabet or the oldest).

1: Using GROUP BY

SELECT a.idProduct, a.description, a.descriptionLong,
a.listPrice, a.price, a.smallImageUrl, a.stock,
a.fileName, a.noShipCharge, MIN(c.topcategory)
FROM products a, categories_products b, categories c
WHERE a.active = -1
AND a.homePage = -1
AND a.idProduct = b.idProduct
AND c.idcategory = b.idcategory
AND a.prodType = 1
GROUP BY a.idProduct, a.description, a.descriptionLong,
a.listPrice, a.price, a.smallImageUrl, a.stock,
a.fileName, a.noShipCharge
ORDER BY a.idProduct DESC
2: Using a subquery.

SELECT a.idProduct, a.description, a.descriptionLong,
a.listPrice, a.price, a.smallImageUrl, a.stock,
a.fileName, a.noShipCharge,
(SELECT MIN(c.topcategory)
FROM categories_products b, categories c
WHERE a.idProduct = b.idProduct
AND c.idcategory = b.idcategory)
FROM products a
WHERE a.active = -1
AND a.homePage = -1
AND a.prodType = 1
ORDER BY a.idProduct DESC

Both queries are untested. Test them both and check the results. Then
compare the execution plan to see which is most efficient with your data.
If you see little difference, pick the one you like best :-)

Best, Hugo
--

(Remove _NO_ and _SPAM_ to get my e-mail address)
Jul 20 '05 #6
>
If you can describe how to choose, I (or someone else in this NG) will
probably be able to help writing up the query.


Eg max(topcategory) or something / somehow?
Jul 20 '05 #7
On Fri, 14 May 2004 09:33:15 +1200, Craig Hoskin wrote:
No cancel that ... I forgot my c.topcategory in the qeruy.

Back to seeking help again :-)


Hi Craig,

So I noted. But it still was a useful post, as it showed that the active,
homepage and product columns are all in the Products table (something that
was not clear from your first post). Without this knowledge, I could not
have presented my second option (see my other post).

Best, Hugo
--

(Remove _NO_ and _SPAM_ to get my e-mail address)
Jul 20 '05 #8
Craig Hoskin (no****@infobahn.co.nz) writes:
Have a problem I would areally appreciate help with.
I have 3 tables in a standard format for a Bookshop, eg

Products
Categories
Categories_Products

the latter allowing me to have products in multiple categories.

Everthing works well except for one annoying little thing.

When an individual product (which is in more than one topcategory) is
added to the Shopping Cart it displays twice, because in my select
statement I have the Category listed. I realise I could remove the
TopCategory from the statement and that makes my DISTINCT work as I
wanted, but Id prefer to have the TopCategory as it saves me later
having to another SQL query (Im already doing one to allow me not to
list category in the Statement .... but If I can overcome this one ...
then I can remove this as well).


You want to display a distinct list, but you only want one TopCategory.
So which one do you want?

Well, assuming you only want the highest, you could do:

SELECT a.idProduct, a.description,a.descriptionLong,
a.listPrice,a.price,a.smallImageUrl,a.stock, a.fileName,a.noShipCharge,
MAX(c.topcategory)
FROM products a, categories_products b, categories c
WHERE active = -1 AND homePage = -1
AND a.idProduct = b.idProduct
AND c.idcategory=b.idcategory
AND prodType = 1 ORDER BY a.idProduct DESC
GROUP BY a.idProduct, a.description,a.descriptionLong,
a.listPrice,a.price,a.smallImageUrl,a.stock,
a.fileName,a.noShipCharge

--
Erland Sommarskog, SQL Server MVP, so****@algonet.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techinf...2000/books.asp
Jul 20 '05 #9
Hi Hugo

Ive tried this which seems to work?

SELECT DISTINCT a.idProduct, a.description,a.descriptionLong,
a.listPrice,a.price,a.smallImageUrl,a.stock, a.fileName,a.noShipCharge,
c.topcategory
FROM products a, categories_products b, categories c
WHERE c.topcategory = (SELECT MAX(c.topcategory)
FROM categories_products b, categories c
WHERE a.idProduct = b.idProduct
AND c.idcategory = b.idcategory)
AND active = -1 AND homePage = -1 A
ND a.idProduct = b.idProduct AND c.idcategory=b.idcategory
AND prodType = 1
ORDER BY a.idProduct DESC
"Hugo Kornelis" <hugo@pe_NO_rFact.in_SPAM_fo> wrote in message
news:7u********************************@4ax.com...

The first idea will have to come from you. SQL Server won't make your
choices for you. So you must decide *which* TopCategory to display for
products that are in more than one TopCategory.

If you can describe how to choose, I (or someone else in this NG) will
probably be able to help writing up the query.

Best, Hugo

Jul 20 '05 #10
Hi Hugo

Hmmmm. Since SQL Server still won't choose for you, allow me to do it for
you. The following two queries will select the "first" topcategory (I
don't know the datatype, so it will either be the lowest numbered, the one
that comes first in the alphabet or the oldest).
Thanks for that ... it was your previous posting that gave me an idea about
the MAX. Good plan :-)

2: Using a subquery.

SELECT a.idProduct, a.description, a.descriptionLong,
a.listPrice, a.price, a.smallImageUrl, a.stock,
a.fileName, a.noShipCharge,
(SELECT MIN(c.topcategory)
FROM categories_products b, categories c
WHERE a.idProduct = b.idProduct
AND c.idcategory = b.idcategory)
FROM products a
WHERE a.active = -1
AND a.homePage = -1
AND a.prodType = 1
ORDER BY a.idProduct DESC

Both queries are untested. Test them both and check the results. Then
compare the execution plan to see which is most efficient with your data.
If you see little difference, pick the one you like best :-)


This last one looked the best.

From a cursory glance, which would be better, yours where the SELECT is a
subquery (forgove my terminiology) or mine where it is in the WHERE clause?

Both provide the same results (it appears) and appear OK performace wise ...
just wondering which is better for a larger number of records?

Thanks again for all your help ... truly very much appreciated.

Cheers

Craig
Jul 20 '05 #11
Hi Erland

Thanks for the reply.

I note you and Hugo both placed the "select Max()" in the SELECT part, as
opposed to one I made where I placed it in the WHERE clause.

Am I correct in assuming that, given you guys and your experience, that
yours will be more efficient ... since you both seem to have choosen the
same approach?

Cheers

Craig

"Erland Sommarskog" <so****@algonet.se> wrote in message
news:Xn**********************@127.0.0.1...
Craig Hoskin (no****@infobahn.co.nz) writes:
Have a problem I would areally appreciate help with.
I have 3 tables in a standard format for a Bookshop, eg

Products
Categories
Categories_Products

the latter allowing me to have products in multiple categories.

Everthing works well except for one annoying little thing.

When an individual product (which is in more than one topcategory) is
added to the Shopping Cart it displays twice, because in my select
statement I have the Category listed. I realise I could remove the
TopCategory from the statement and that makes my DISTINCT work as I
wanted, but Id prefer to have the TopCategory as it saves me later
having to another SQL query (Im already doing one to allow me not to
list category in the Statement .... but If I can overcome this one ...
then I can remove this as well).


You want to display a distinct list, but you only want one TopCategory.
So which one do you want?

Well, assuming you only want the highest, you could do:

SELECT a.idProduct, a.description,a.descriptionLong,
a.listPrice,a.price,a.smallImageUrl,a.stock, a.fileName,a.noShipCharge,
MAX(c.topcategory)
FROM products a, categories_products b, categories c
WHERE active = -1 AND homePage = -1
AND a.idProduct = b.idProduct
AND c.idcategory=b.idcategory
AND prodType = 1 ORDER BY a.idProduct DESC
GROUP BY a.idProduct, a.description,a.descriptionLong,
a.listPrice,a.price,a.smallImageUrl,a.stock,
a.fileName,a.noShipCharge

--
Erland Sommarskog, SQL Server MVP, so****@algonet.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techinf...2000/books.asp

Jul 20 '05 #12
On Fri, 14 May 2004 10:08:16 +1200, Craig Hoskin wrote:

(snip)
This last one looked the best.

From a cursory glance, which would be better, yours where the SELECT is a
subquery (forgove my terminiology) or mine where it is in the WHERE clause?
Hi Craig,

Your query seems to contain some redundant joins. This may not hurt you if
the optimizer "sees" that it doesn't need to perform the join twice, but
I'm not sure if the optimizer is intelligent enough for that.

Both provide the same results (it appears) and appear OK performace wise ...
just wondering which is better for a larger number of records?


Without testing, I'd say mine is quicker (probably) or they have equal
performance (slight chance). But you'd have to test to be sure. Make sure
you have an adequate (ie large) set of test data with as typical
distribution as possible. Run both queries with SET STATISTICS IO. Also,
compare the execution plans.

Best, Hugo
--

(Remove _NO_ and _SPAM_ to get my e-mail address)
Jul 20 '05 #13
On Fri, 14 May 2004 00:48:02 +0200, Hugo Kornelis wrote:
On Fri, 14 May 2004 10:08:16 +1200, Craig Hoskin wrote:

(snip)
This last one looked the best.

From a cursory glance, which would be better, yours where the SELECT is a
subquery (forgove my terminiology) or mine where it is in the WHERE clause?


Hi Craig,

Your query seems to contain some redundant joins. This may not hurt you if
the optimizer "sees" that it doesn't need to perform the join twice, but
I'm not sure if the optimizer is intelligent enough for that.


Forgot to mention - my first suggestion (identical to Erland's) has a high
probability of being the fastest of them all (though it does depend on
what indexes are available for the optimizer to toy with).

I suspect the long group by list is what makes this one "look" less good
to you. However, this will probably look better to the optimizer than the
correlated subqueries in my second query and in your post. Also, do keep
in mind that you can mentally replace the entire group by list by "all
columns in the select clause except topcategory".

Best, Hugo
--

(Remove _NO_ and _SPAM_ to get my e-mail address)
Jul 20 '05 #14
Craig Hoskin (no****@infobahn.co.nz) writes:
I note you and Hugo both placed the "select Max()" in the SELECT part, as
opposed to one I made where I placed it in the WHERE clause.

Am I correct in assuming that, given you guys and your experience, that
yours will be more efficient ... since you both seem to have choosen the
same approach?


For this case it is probably more effecient. It was also less typing...

--
Erland Sommarskog, SQL Server MVP, so****@algonet.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techinf...2000/books.asp
Jul 20 '05 #15

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

5
by: Colman | last post by:
Howdy all! I guess I'm a newbie, because I am stumped (or maybe just too durned tired). Here's what I got... CREATE TABLE `nodecat_map` ( `nodecat_id` mediumint(8) unsigned NOT NULL...
5
by: Reestit Mutton | last post by:
Hi, I'm currently learning the ropes with CSS, PHP, MYSQL, Javascript etc... by redesigning my website as a database driven site. Okay, so I'm skilled at perl, data manipulation and data...
6
by: RCS | last post by:
I've been running into more and more complexity with an application, because as time goes on - we need more and more high-level, rolled-up information. And so I've created views, and views that use...
3
by: Tcs | last post by:
My backend is DB2 on our AS/400. While I do HAVE DB2 PE for my PC, I haven't loaded it yet. I'm still using MS Access. And no, I don't believe this is an Access question. (But who knows? I...
23
by: Brian | last post by:
Hello All - I am wondering if anyone has any thoughts on which is better from a performance perspective: a nested Select statement or an Inner Join. For example, I could do either of the...
9
by: Kelvin | last post by:
Okay so this is baking my noodle. I want to select all the attritbutes/fields from a table but then to excluded any row in which a single attributes data has been duplicated. I.E. Here's my...
2
by: Andrew Lias | last post by:
Let us say that I have a table with two sets of values as such: Item Extension --- ---- 100023 1 100025 1 100025 2 100028 1 100029 1 100029 2
6
by: Bob Stearns | last post by:
I am getting duplicate rows back from a select distinct statement of the form: SELECT DISTINCT 'jhough', '000111', t0.bhid FROM (SELECT lots of good stuff) t0 LEFT OUTER JOIN another_table ...
22
by: MP | last post by:
vb6,ado,mdb,win2k i pass the sql string to the .Execute method on the open connection to Table_Name(const) db table fwiw (the connection opened via class wrapper:) msConnString = "Data Source="...
4
tjc0ol
by: tjc0ol | last post by:
Hi guys, I'm a newbie in php and I got error in my index.php which is: 1054 - Unknown column 'p.products_id' in 'on clause' select p.products_image, pd.products_name, p.products_id,...
1
by: CloudSolutions | last post by:
Introduction: For many beginners and individual users, requiring a credit card and email registration may pose a barrier when starting to use cloud servers. However, some cloud server providers now...
0
by: Faith0G | last post by:
I am starting a new it consulting business and it's been a while since I setup a new website. Is wordpress still the best web based software for hosting a 5 page website? The webpages will be...
0
by: taylorcarr | last post by:
A Canon printer is a smart device known for being advanced, efficient, and reliable. It is designed for home, office, and hybrid workspace use and can also be used for a variety of purposes. However,...
0
by: Charles Arthur | last post by:
How do i turn on java script on a villaon, callus and itel keypad mobile phone
0
by: aa123db | last post by:
Variable and constants Use var or let for variables and const fror constants. Var foo ='bar'; Let foo ='bar';const baz ='bar'; Functions function $name$ ($parameters$) { } ...
0
by: ryjfgjl | last post by:
If we have dozens or hundreds of excel to import into the database, if we use the excel import function provided by database editors such as navicat, it will be extremely tedious and time-consuming...
0
by: emmanuelkatto | last post by:
Hi All, I am Emmanuel katto from Uganda. I want to ask what challenges you've faced while migrating a website to cloud. Please let me know. Thanks! Emmanuel
0
BarryA
by: BarryA | last post by:
What are the essential steps and strategies outlined in the Data Structures and Algorithms (DSA) roadmap for aspiring data scientists? How can individuals effectively utilize this roadmap to progress...
1
by: Sonnysonu | last post by:
This is the data of csv file 1 2 3 1 2 3 1 2 3 1 2 3 2 3 2 3 3 the lengths should be different i have to store the data by column-wise with in the specific length. suppose the i have to...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.