472,127 Members | 1,751 Online
Bytes | Software Development & Data Engineering Community
Post +

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 472,127 software developers and data experts.

"Users shopping for A were also interested in B"

How should I set up a database to be able to efficiently maintain
associations between related items?

Example: Users shopping for Lord Of The Rings trilogy were also
interested in The Hobbit.

There will be many (20000+) items for sale and I need to do this
efficiently, but I don't have an idea how could I do it.


Jul 23 '05 #1
4 1006

David Portas
SQL Server MVP

Jul 23 '05 #2
Yeah but how to create these associations efficiently in the first

Jul 23 '05 #3
This called a relational division, and you can Google it. You want one
with a remainder for doing "shopping baskets".

Having said this, you are better off looking for a data mining tool
that will have the ability to set up the baskets on the fly for
somethign like Amazon.

Assuming you only want to have pairs in the baskets and do this only
once and not on the fly, you can set up a sparse matrix in SQL:

(item_1_upc CHAR(13) NOT NULL,
item_2_upc CHAR(13) NOT NULL,
CHECK (item_1_upc < item_2_upc) ,
PRIMARY KEY (item_1_upc, item_2_upc));

Now go thru every order, pull out all possible pairs that occur more
than some threshold:

INSERT INTO Pairs(item_1_upc, item_2_upc, tally)
SELECT O1.upc, O2.upc, COUNT(*)
FROM Orders AS O1, Orders AS O2
WHERE O1.order_nbr = O2.order_nbr
AND O1.upc < O2.upc
GROUP BY O1.upc, O2.upc
HAVING COUNT(*) > @some_limit;
From the Pairs matrix, you can then build Triplets with whatever rules

you wish based on the tally. For example, given (a, b, 123) and
(b,c, 100) you know that (a,b,c, ??) cannot have more than 100
occurences, but if I have 1000 b's, the number could be as low as zero.

Jul 23 '05 #4
The first query in the reply I posted does that ("efficiently" is moot
I suppose). If you need something different then please post DDL and
sample data so that we can understand what you need.

In reality, I think most e-tailers would not want to automate the
link-selling process. I suspect the results would be unpredictable and
I doubt that the major sites ever do it that way. Careful review and
selection of the linked products by the sales and marketing team will
probably give a better return.

David Portas
SQL Server MVP

Jul 23 '05 #5

This discussion thread is closed

Replies have been disabled for this discussion.

Similar topics

27 posts views Thread by Ron Adam | last post: by
2 posts views Thread by Tim Constantine | last post: by
2 posts views Thread by jwalton | last post: by
4 posts views Thread by seansan | last post: by

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.