Self join to multiple references...

Christopher Harrison

Is there a way to store an indefinite number of keys in one field and
self join them? Imagine, for example, you have a table of users with a
"friends" column. Say user 1 is friends with users 9, 7, 24 and 6; then
would it be possible to store this
array/list/whatever-you-want-to-call-it into user 1's friends cell and
extract the join in a query, some how? An easy and obvious way, that
is; I can think of string processing hacks ;)

Thanks :)
Christopher Harrison

Oct 25 '05 #1

Subscribe Reply

2641

Gordon Burditt

>Is there a way to store an indefinite number of keys in one field and

self join them?
Under the right circumstances, group_concat() might do that.
Imagine, for example, you have a table of users with a
"friends" column. Say user 1 is friends with users 9, 7, 24 and 6; then
would it be possible to store this
array/list/whatever-you-want-to-call-it into user 1's friends cell and
extract the join in a query, some how?
I can't imagine why you'd want to STORE the info that way. It makes
it very inconvenient to access. I can see why you might want to
display it that way if the screen and/or paper were wide enough to
accomodate the person with the most friends.
An easy and obvious way, that
is; I can think of string processing hacks ;)

Gordon L. Burditt

Oct 25 '05 #2

Christopher Harrison

It came to me, about an hour after I posted my message, a reasonable
way to do this. I'd just like to ask people's opinion on my technique
(I'm kind of a newbie ;)

Basically, each user doesn't have a "friends" column anymore, they have
their own table (e.g. "friends_of_use r123") that lists their friends'
ids from the users table. One could make the user's friends' ids the
primary key to avoid duplicates.
The bonus to this method is that the table could be used to store other
information (e.g. the user's relationship with each friend: best
friend; casual acquaintence; etc.). These friends tables could be
generated easily with an SQL script. The problem lies in having a great
heap of tables, as each user is going to require one of their own...
Although it would work, it seems to me like there's a great deal of
redundancy that should be simplifed.

Any thoughts?

Christopher Harrison
Gordon Burditt wrote:

Is there a way to store an indefinite number of keys in one field and
self join them?

Under the right circumstances, group_concat() might do that.
Imagine, for example, you have a table of users with a
"friends" column. Say user 1 is friends with users 9, 7, 24 and 6; then
would it be possible to store this
array/list/whatever-you-want-to-call-it into user 1's friends cell and
extract the join in a query, some how?

I can't imagine why you'd want to STORE the info that way. It makes
it very inconvenient to access. I can see why you might want to
display it that way if the screen and/or paper were wide enough to
accomodate the person with the most friends.
An easy and obvious way, that
is; I can think of string processing hacks ;)

Gordon L. Burditt

Oct 26 '05 #3

Gordon Burditt

>It came to me, about an hour after I posted my message, a reasonable

way to do this. I'd just like to ask people's opinion on my technique
(I'm kind of a newbie ;)

Basically, each user doesn't have a "friends" column anymore, they have
their own table (e.g. "friends_of_use r123") that lists their friends'
ids from the users table. One could make the user's friends' ids the
primary key to avoid duplicates.
Naming tables based on a field is almost always a bad idea. Consider
adding a column, and put the table name, or the variable part of
it, into THAT. Now throw all the rows into a single table. E.g.
you have a table "friends", with columns "user" and "friend". Add
a unique index on (user,friend).

The one exception I can think of to the above rule is archival data.
Say, you've got different tables for transactions in different
years. Normally, I'd say put them all in one table, *BUT* if the
idea for separating them is to remove the old transactions from the
active database to CD-ROM or tape or something because you don't
have enough storage for 80 years of transactions, that's a reasonable
approach.

This kind of "link" table is common for a many-to-many relationship.
(e.g. A is a DVD title owned by B). If A is related to B, you put
in a row for (A,B), using the primary keys of the other tables
involved. You can have many rows with A in the left column, and
many rows with A in the right column. Often there is the restriction
that you may only have (A,B) in there ONCE (there's no "A is a
triple-friend of B"), so you have a unique index on the pair of
columns. You get the list of A's friends with select friend from
friends where user = 'A'. You can also get a list of the people
who consider B a friend with select user from friends where friend
= 'B'.

I'm assuming here that you are not trying to represent a two-way
relationship here. In other words, if A considers B to be a friend,
it doesn't necessarily mean that B considers A to be a friend, but
you're insisting on it anyway. Sometimes that holds: if A is
married to B, normally B is also married to A (anyone know of real
exceptions to this?). However A may *THINK* he is married to B,
but B is a trigamist and A isn't her first husband. But don't count
on being able to label the columns "husband" and "wife". In some
states that doesn't hold.

All of the ways mentioned (all crammed into one field, lists in
different tables, link table listing each pair of people and friend)
don't deal with an assumption "if A is a friend of B, then B must
be a friend of A" well, and it takes extra code to enforce that if
you insist on doing so. It also makes it hard to determine who can
control the friends lists, and if A is stalking B, you may have to
deal with court orders from B to remove "B is a friend of A" and
not let A put it back.
The bonus to this method is that the table could be used to store other
information (e.g. the user's relationship with each friend: best
friend; casual acquaintence; etc.).
"link" tables don't usually have this extra info but it is an appropriate
place to put it in some circumstances.
These friends tables could be
generated easily with an SQL script. The problem lies in having a great
heap of tables, as each user is going to require one of their own... Although it would work, it seems to me like there's a great deal of
redundancy that should be simplifed.

One "link" table is, I think, a much better solution. It also allows
queries like "how many people have B as a friend"? which the
variable-table-name approach doesn't (at least not easily).

Gordon L. Burditt

Oct 26 '05 #4

Christopher Harrison

> Naming tables based on a field is almost always a bad idea. Consider

adding a column, and put the table name, or the variable part of
it, into THAT. Now throw all the rows into a single table. E.g.
you have a table "friends", with columns "user" and "friend". Add
a unique index on (user,friend).

Ahh, that might be a better idea ;) I was thinking that, like you say,
naming tables after fields is a bit dodgy... So if I set multiple
fields as the primary keys, data in one of the fields doesn't
necessarily need to be unique, just so long as, combined with the other
fields, it is a unique tuple? Thanks for that :)
Fortunately I'm not trying to represent a two-way relationship; well,
it can be either/or. I realise it bloats things somewhat, but in that
event, we can just have (A,B) and (B,A) as records. I guess we could
have an extra boolean field for two-way relationships; but then, if I
had my status field, the two friends would have to consider themselves
equal partners (i.e. both good friends, etc.), which might not be
realistic.

Thanks very much :)
Christopher Harrison

P.S. I realise it's a bit contrived, but whilst you can be married to a
mafia don (and so, vicariously, married to the mob), it doesn't really
make sense for the mob to be married to you ;)

Oct 26 '05 #5

Gordon Burditt

>> Naming tables based on a field is almost always a bad idea. Consider

adding a column, and put the table name, or the variable part of
it, into THAT. Now throw all the rows into a single table. E.g.
you have a table "friends", with columns "user" and "friend". Add
a unique index on (user,friend).
Ahh, that might be a better idea ;) I was thinking that, like you say,
naming tables after fields is a bit dodgy... So if I set multiple
fields as the primary keys, data in one of the fields doesn't
necessarily need to be unique, just so long as, combined with the other
fields, it is a unique tuple? Thanks for that :)

Yep. I do that all the time, for instance, in activity logs, which
record stuff like hourly mail traffic through various hosts back a
few years. The primary key is likely to be the fields (timestamp,
host, statistic) where there are plenty (hundreds) of identical
timestamps (usually a multiple of 1 hour), plenty (many thousands)
of records for the same host, and plenty (millions) of records for
the same statistic (mail received, mail delivered, viruses rejected,
mail bounced, etc. There's a big enum for all the statistics names)
but only one count for any combination of the three.
Fortunately I'm not trying to represent a two-way relationship; well,
it can be either/or. I realise it bloats things somewhat, but in that
event, we can just have (A,B) and (B,A) as records. I guess we could
have an extra boolean field for two-way relationships; but then, if I
had my status field, the two friends would have to consider themselves
equal partners (i.e. both good friends, etc.), which might not be
realistic.
Think about the query needed to determine if A is related to B in
a two-way setup, leaving aside any extra data going into the link
table, like how serious/type of relationship. The extra data makes
it messier and putting in two records, one for each way, looks more
and more attractive. Even without the extra data, spending the
table space for performance and less complexity may be a big win.

SELECT count(*) FROM friends where user = 'A' and friend = 'B';

will return 1 if A is a friend of B, and 0 if not. If you're trying to
represent a two-way relationship, you need something like:

SELECT count(*) FROM friends where (user = 'A' and friend = 'B') or
(friend = 'A' and user = 'B' and two_way_flag = 'Y')

which is less efficient. It also might manage to return 2 (the
primary key constraint won't prevent (A,B) and (B,A) both existing.)

Adequately updating this also requires some effort (and locking)
when a user decides to delete a friend. You might have to turn a
two-way relationship into a one-way one.

Trying to do joins between a user and his friends (the user table
gets joined twice, once for the user, once for the friend, and the
friends table gets joined once) to determine, say, the distribution
of age differences between users and their friends gets MUCH messier
with the two-way setup. You can't join on variable column names
or variable table names.

select (u1.birthyear - u2.birthyear) as diff, count(u1.birthy ear)
from user u1, user u2, friend f where u1.id = f.user and
u2.id = f.friend group by diff;
P.S. I realise it's a bit contrived, but whilst you can be married to a
mafia don (and so, vicariously, married to the mob), it doesn't really
make sense for the mob to be married to you ;)

That's not the definition of marriage that Texas uses (Texas has a
constitutional amendment on the ballot defining marriage as a union
of one man and one woman. I would prefer to see marriage defined
as the union of two social security numbers on a Form 1040, or
better, define marriage as whatever your church says and leave the
state out of it completely so, for instance, that "marital status"
box on tax forms has to go away.)

You might actually end up with A married to B but not B married to
A if A and B are living in different states (I believe there are
some contrived situations where the state line runs through the
marriage bed) with different laws regarding recognition of homosexual
marriage.

Gordon L. Burditt

Oct 26 '05 #6

Rich Ryan

Hey Christopher,

Yes, you could do that, but why?

Why not just create a simple table like:

CREATE TABLE UserFriends(
UserID int,
FriendUserID int)

"Christophe r Harrison" <Sp*********@Ch risHarrison.co. uk> wrote in message
news:11******** **************@ g44g2000cwa.goo glegroups.com.. .

Is there a way to store an indefinite number of keys in one field and
self join them? Imagine, for example, you have a table of users with a
"friends" column. Say user 1 is friends with users 9, 7, 24 and 6; then
would it be possible to store this
array/list/whatever-you-want-to-call-it into user 1's friends cell and
extract the join in a query, some how? An easy and obvious way, that
is; I can think of string processing hacks ;)

Thanks :)
Christopher Harrison

Oct 26 '05 #7

Similar topics

3456

complex mysql 3.23 LEFT JOIN question

by: Paul Bramscher | last post by:

Here's one for pathological SQL programmers. I've got a table of things called elements. They're components, sort of like amino acids, which come together to form complex web pages -- as nodes in trees which form parent-child relationships, sort of like newsgroups. For example, the parent_id field points to another element. Indent_level is there for denormalization purposes, to avoid costly recursive issues in querying. The...

MySQL Database

11195

Join across servers

by: steve | last post by:

Can you join two tables across different servers in mySQL, or am I dreaming of Oracle? -- Posted using the http://www.dbforumz.com interface, at author's request Articles individually checked for conformance to usenet standards Topic URL: http://www.dbforumz.com/mySQL-Join-servers-ftopict225285.html Visit Topic URL to contact author (reg. req'd). Report abuse: http://www.dbforumz.com/eform.php?p=777346

MySQL Database

9226

Stored procedure deadlock on self with subquery

by: Andrew Mayo | last post by:

Here's a really weird one for any SQL Server gurus out there... We have observed (SQL Server 2000) scenarios where a stored procedure which (a) begins a transaction (b) inserts some rows into a table (c) re-queries another table using a subquery which references the inserted table (correlated or not)

Microsoft SQL Server

4107

complexity of nested self left joins?

by: jbm05 | last post by:

Hi, I'm curious about the computational complexity of a query I have. The query contains multiple nested self left joins, starting with a simple select, then doing a self left join with the results, then doing a self left join with those results, etc. What puzzles me is that the time required for the query seems to grow exponentially as I add additional left joins, which I didn't expect. I expected the inner select to return about 25...

Microsoft SQL Server

3185

when and how do you use Self?

by: Tieche Bruce A MSgt USMTM/AFD | last post by:

I am new to python, Could someone explain (in English) how and when to use self? I have been reading, and haven't found a good example/explanation

Python

8630

Self-Referencing tables

by: John E | last post by:

I have a question about how to make records in a table reference other records in the same table.... Suppose I'm building a small database to track some basic information on several companies. In addition to the ID and other info, I want to track predecessor and succesor companies, if any. In other words, a given company may have come into being from the merger of two or more predecessors, may have been spun-off from a single...

Microsoft Access / VBA

1523

Mammoth amounts of self joins

by: sks | last post by:

I have a table that contains keywords (Varchars) each one mapped to a product. so the database schema is just an id column, product column and keyword column. I want to select the products that contain x,y,z keywords. Now if this query involves many keywords I end up with a massive amount of self joins on the keywords table, is there a better way to do this than self joins? What would work perhaps is multiple unions where I could discard...

MySQL Database

6345

Can inner join replace multiple loop selects?

by: MP | last post by:

Hi trying to begin to learn database using vb6, ado/adox, mdb format, sql (not using access...just mdb format via ado) i need to group the values of multiple fields - get their possible variations(combination of fields), - then act on each group in some way ...eg ProcessRs (oRs as RecordSet)... the following query will get me the distinct groups

Microsoft Access / VBA

1788

Self Joins in Access - Error Encountered.

by: Darragh | last post by:

Hi all! I'm having a bit of an issue making a self join in access (similar to the example explained on Allen Browne's excellent site - http://allenbrowne.com/ser-06.html). I've made the self join in the query window, however when I've tried to run the query i get the message "table t1 opened exclusively by another user, or it is already open through the user interface and cannot be manipulated programmatically"

Microsoft Access / VBA

9462

What is ONU?

by: marktang | last post by:

ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However, people are often confused as to whether an ONU can Work As a Router. In this blog post, we’ll explore What is ONU, What Is Router, ONU & Router’s main usage, and What is the difference between ONU and Router. Let’s take a closer look ! Part I. Meaning of...

General

9287

Changing the language in Windows 10

by: Hystou | last post by:

Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can effortlessly switch the default language on Windows 10 without reinstalling. I'll walk you through it. First, let's disable language synchronization. With a Microsoft account, language settings sync across devices. To prevent any complications,...

Windows Server

9886

Maximizing Business Potential: The Nexus of Website Design and Digital Marketing

by: jinu1996 | last post by:

In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven tapestry of website design and digital marketing. It's not merely about having a website; it's about crafting an immersive digital experience that captivates audiences and drives business growth. The Art of Business Website Design Your website is...

Online Marketing

8723

AI Job Threat for Devs

by: agi2029 | last post by:

Let's talk about the concept of autonomous AI software engineers and no-code agents. These AIs are designed to manage the entire lifecycle of a software development project—planning, coding, testing, and deployment—without human intervention. Imagine an AI that can take a project description, break it down, write the code, debug it, and then launch it, all on its own.... Now, this would greatly impact the work of software developers. The idea...

Career Advice

7259

Access Europe - Using VBA to create a class based on a table - Wed 1 May

by: isladogs | last post by:

The next Access Europe User Group meeting will be on Wednesday 1 May 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome a new presenter, Adolph Dupré who will be discussing some powerful techniques for using class modules. He will explain when you may want to use classes instead of User Defined Types (UDT). For example, to manage the data in unbound forms. Adolph will...

Microsoft Access / VBA

6542

Couldn’t get equations in html when convert word .docx file to html file in C#.

by: conductexam | last post by:

I have .net C# application in which I am extracting data from word file and save it in database particularly. To store word all data as it is I am converting the whole word file firstly in HTML and then checking html paragraph one by one. At the time of converting from word file to html my equations which are in the word document file was convert into image. Globals.ThisAddIn.Application.ActiveDocument.Select();...

C# / C Sharp

5155

Trying to create a lan-to-lan vpn between two differents networks

by: TSSRALBI | last post by:

Hello I'm a network technician in training and I need your help. I am currently learning how to create and manage the different types of VPNs and I have a question about LAN-to-LAN VPNs. The last exercise I practiced was to create a LAN-to-LAN VPN between two Pfsense firewalls, by using IPSEC protocols. I succeeded, with both firewalls in the same network. But I'm wondering if it's possible to do the same thing, with 2 Pfsense firewalls...

Networking - Hardware / Configuration

3817

transfer the data from one system to another through ip address

by: 6302768590 | last post by:

Hai team i want code for transfer the data from one system to another through IP address by using C# our system has to for every 5mins then we have to update the data what the data is updated we have to send another system

C# / C Sharp

2677

Comprehensive Guide to Website Development in Toronto: Expert Insights from BSMN Consultancy

by: bsmnconsultancy | last post by:

In today's digital era, a well-designed website is crucial for businesses looking to succeed. Whether you're a small business owner or a large corporation in Toronto, having a strong online presence can significantly impact your brand's success. BSMN Consultancy, a leader in Website Development in Toronto offers valuable insights into creating effective websites that not only look great but also perform exceptionally well. In this comprehensive...

General