473,666 Members | 2,386 Online
Bytes | Software Development & Data Engineering Community
+ Post

Home Posts Topics Members FAQ

Mammoth amounts of self joins

sks
I have a table that contains keywords (Varchars) each one mapped to a
product. so the database schema is just an id column, product column and
keyword column.

I want to select the products that contain x,y,z keywords. Now if this query
involves many keywords I end up with a massive amount of self joins on the
keywords table, is there a better way to do this than self joins? What would
work perhaps is multiple unions where I could discard non duplicated rows
but you can't seem to do that. Any ideas... ?
Mar 22 '06 #1
3 1517
sks wrote:
I have a table that contains keywords (Varchars) each one mapped to a
product. so the database schema is just an id column, product column and
keyword column. I want to select the products that contain x,y,z keywords. Now if this query
involves many keywords I end up with a massive amount of self joins on the
keywords table, is there a better way to do this than self joins? What would
work perhaps is multiple unions where I could discard non duplicated rows
but you can't seem to do that. Any ideas... ?


Show us the queries you have come up with and why they don't work and
provide some sample data (does not have to be real data).

Is there more than one table involved?

What do you consider "many keywords"?

Why are you doing "self-joins"?

Do the keywords map to more than one product?

Is this for a school project or some job that you have?

Mar 22 '06 #2
sks

"noone" <no***@nowhere. com> wrote in message
news:f2******** *************** *********@www.f irstdbasource.c om...
sks wrote:
I have a table that contains keywords (Varchars) each one mapped to a
product. so the database schema is just an id column, product column and
keyword column.

I want to select the products that contain x,y,z keywords. Now if this
query
involves many keywords I end up with a massive amount of self joins on
the
keywords table, is there a better way to do this than self joins? What
would
work perhaps is multiple unions where I could discard non duplicated rows
but you can't seem to do that. Any ideas... ?


Show us the queries you have come up with and why they don't work and
provide some sample data (does not have to be real data).
Is there more than one table involved?
What do you consider "many keywords"?

Why are you doing "self-joins"?
Do the keywords map to more than one product?
Is this for a school project or some job that you have?


It's for a software package I write.

The keywords are used as meta data about an item. Each of the keywords is
for a particular attribute that is relevant for certain items only. This is
all definable by the user at runtime. Let's say we had a site selling DVDs,
the attributes in this case might be Director, Actor, and Genre. Each of
these attributes might have multiple values (or none).

Here is a simple schema to demonstrate

Items
--------------
id
name

Attributes
----------
id
name

Attributes_Valu es
----------
id
item
value

I might want to find all DVDs that had Clint Eastwood, or Tommy Lee Jones,
or Richard Burton in as Actor, and where Comedies, Action or Drama in Genre.

I could self join like this

select distinct i.* from item i join attributes_valu es av1 on i.id=av1.item
join attributes_valu es av2 on i.id=av2.item where (av1.value='Cli nt
Eastwood' or av1.value='Tomm y Lee Jones' ......) and av2.value='Come dy' or
av2.value='Dram a' ....) where i.status='Live' .... (other item related
clauses here).

This runs extremely slowly when I have more than a few attributes. 9 or 10
is quite common on some items and then you could be searching for 5 to 10
values in each attribute.

My faster solution (and the one I use ) is to select the item column from
the attribute_value s table and union it for each different attribute I am
searching against, then group that result and select from it where count is
equal to the number of attributes I searched for and then join that against
the item table. It's fast but I think ugly.

I am wondering if there is an obvious solution I have missed.
Mar 24 '06 #3
"sks" <sk*@privacy.me > wrote in message
news:GI******** *************** *******@eclipse .net.uk...
My faster solution (and the one I use ) is to select the item column from
the attribute_value s table and union it for each different attribute I am
searching against, then group that result and select from it where count
is equal to the number of attributes I searched for and then join that
against the item table. It's fast but I think ugly.


This solution also scales much better than the self-join method. Keep in
mind that MySQL has a limit of 31 joined tables per query (or 63 if you use
a 64-bit operating system). It's likely that a 31-way join performs very
poorly.

The Entity-Attribute-Value data model (or EAV) is hard to use right, and it
violates a few principles of relational modeling and normalization.

See also my reply to today's thread with subject, "Joining data from 2
fields into the first one".

Regards,
Bill K.
Mar 24 '06 #4

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

4
1371
by: Denis St-Michel | last post by:
Hello All, Hope some Guru will be able to help me with this. Let's take this example table A ------------------------------------------------------------------------------- id | TicketNo | evaluation | Username ------------------------------------------------------------------------------- 1 1 9 Jamie 2 1 8.5 ...
1
2791
by: nuked | last post by:
I have a table that has values of variables for certain entities. The columns of interest are targetID, variableID, and valueID. A row (1, 5, 9) means that target number 1 has a value of 9 for variable 5. Being denormalized, target number one will have many possible rows in this table, one for each variable for which it has a value. My problem occurs when I want to find out what targets match a certain set of variable values. For...
7
9209
by: Andrew Mayo | last post by:
Here's a really weird one for any SQL Server gurus out there... We have observed (SQL Server 2000) scenarios where a stored procedure which (a) begins a transaction (b) inserts some rows into a table (c) re-queries another table using a subquery which references the inserted table (correlated or not)
4
4100
by: jbm05 | last post by:
Hi, I'm curious about the computational complexity of a query I have. The query contains multiple nested self left joins, starting with a simple select, then doing a self left join with the results, then doing a self left join with those results, etc. What puzzles me is that the time required for the query seems to grow exponentially as I add additional left joins, which I didn't expect. I expected the inner select to return about 25...
10
3242
by: Rada Chirkova | last post by:
Hi, at NC State University, my students and I are working on a project called "self-organizing databases," please see description below. I would like to use an open-source database system for implementation and would really appreciate your opinion on whether PostgreSQL is suitable for the project. In general, I am very impressed by the quality of PostgreSQL code and documentation, as well as by the support of the developer community. ...
5
7247
by: Julia Baresch | last post by:
Hi everyone, I haven't found any reference to this problem on this group or in Access 97 help. I hope someone here knows this - probably a simple thing, but not obvious. I designed a query to pull payment data from my database for accounting. It has 3 tables with one-to-many joins: Table1 --> one-to-many --> Table2 --> one-to-many --> Table3 The payment amount field is in Table3. I entered a few negative
6
8615
by: John E | last post by:
I have a question about how to make records in a table reference other records in the same table.... Suppose I'm building a small database to track some basic information on several companies. In addition to the ID and other info, I want to track predecessor and succesor companies, if any. In other words, a given company may have come into being from the merger of two or more predecessors, may have been spun-off from a single...
6
6296
by: davegb | last post by:
I'm trying to create a self-join table to show the relationship between employee and supervisor. In another thread, I was advised to create a SupervisorID in the employee table, a separate Supervisor table, and join the Supervisor table to the Employee table and a copy of the Supervisor table to create the self-join. I can't figure out how to do this from reading Viescas or from researching it here. I can start a query and create 2 copies...
2
1783
by: Darragh | last post by:
Hi all! I'm having a bit of an issue making a self join in access (similar to the example explained on Allen Browne's excellent site - http://allenbrowne.com/ser-06.html). I've made the self join in the query window, however when I've tried to run the query i get the message "table t1 opened exclusively by another user, or it is already open through the user interface and cannot be manipulated programmatically"
0
8866
Oralloy
by: Oralloy | last post by:
Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers, it seems that the internal comparison operator "<=>" tries to promote arguments from unsigned to signed. This is as boiled down as I can make it. Here is my compilation command: g++-12 -std=c++20 -Wnarrowing bit_field.cpp Here is the code in...
0
8638
tracyyun
by: tracyyun | last post by:
Dear forum friends, With the development of smart home technology, a variety of wireless communication protocols have appeared on the market, such as Zigbee, Z-Wave, Wi-Fi, Bluetooth, etc. Each protocol has its own unique characteristics and advantages, but as a user who is planning to build a smart home system, I am a bit confused by the choice of these technologies. I'm particularly interested in Zigbee because I've heard it does some...
0
7381
agi2029
by: agi2029 | last post by:
Let's talk about the concept of autonomous AI software engineers and no-code agents. These AIs are designed to manage the entire lifecycle of a software development project—planning, coding, testing, and deployment—without human intervention. Imagine an AI that can take a project description, break it down, write the code, debug it, and then launch it, all on its own.... Now, this would greatly impact the work of software developers. The idea...
1
6191
isladogs
by: isladogs | last post by:
The next Access Europe User Group meeting will be on Wednesday 1 May 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome a new presenter, Adolph Dupré who will be discussing some powerful techniques for using class modules. He will explain when you may want to use classes instead of User Defined Types (UDT). For example, to manage the data in unbound forms. Adolph will...
0
4193
by: TSSRALBI | last post by:
Hello I'm a network technician in training and I need your help. I am currently learning how to create and manage the different types of VPNs and I have a question about LAN-to-LAN VPNs. The last exercise I practiced was to create a LAN-to-LAN VPN between two Pfsense firewalls, by using IPSEC protocols. I succeeded, with both firewalls in the same network. But I'm wondering if it's possible to do the same thing, with 2 Pfsense firewalls...
0
4365
by: adsilva | last post by:
A Windows Forms form does not have the event Unload, like VB6. What one acts like?
1
2769
by: 6302768590 | last post by:
Hai team i want code for transfer the data from one system to another through IP address by using C# our system has to for every 5mins then we have to update the data what the data is updated we have to send another system
2
2006
muto222
by: muto222 | last post by:
How can i add a mobile payment intergratation into php mysql website.
2
1769
bsmnconsultancy
by: bsmnconsultancy | last post by:
In today's digital era, a well-designed website is crucial for businesses looking to succeed. Whether you're a small business owner or a large corporation in Toronto, having a strong online presence can significantly impact your brand's success. BSMN Consultancy, a leader in Website Development in Toronto offers valuable insights into creating effective websites that not only look great but also perform exceptionally well. In this comprehensive...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.