Cartesian product bug?

Hi,

We have found a possible bug in PostgreSQL 7.3.1. It seems that using CROSS JOIN and
doing a plain Cartesian product, by listing two tables in the FROM clause, give
different results. According to the documentation these should be
equivalent. The following example should explain the problem:

CREATE TABLE a (a1 text, a2 text);
CREATE TABLE b (b1 text, b2 text);
CREATE TABLE c (a1 text, b1 text, c1 text);

INSERT INTO a VALUES('a1', 'a2');
INSERT INTO b VALUES('b1', 'b2');
INSERT INTO c VALUES('a3', 'b1', 'c1');

SELECT * FROM a,b NATURAL JOIN c;
 a1 | a2 | b1 | b2 | a1 | c1
----+----+----+----+----+----
 a1 | a2 | b1 | b2 | a3 | c1
(1 row)

SELECT * FROM a CROSS JOIN b NATURAL JOIN c;
 a1 | b1 | a2 | b2 | c1
----+----+----+----+----
(0 rows)

These two example queries should give the same result. In the first query,
it seems like it’s doing the natural join between b and c, and then does
the Cartesian product on that result with a. On the second query, it does
as we assume it should, namely does the Cartesian product first.

Is this the correct behavior?

Regards

Åsmund
--
Using M2, Opera's revolutionary e-mail client: http://www.opera.com/m2/
Nov 12 '05 #1
Åsmund Kveim Lie <as******@skipthis.ifi.uio.no> writes:

> SELECT * FROM a,b NATURAL JOIN c;

This parses as

select * from a, (b natural join c)

> SELECT * FROM a CROSS JOIN b NATURAL JOIN c;

This parses as

select * from (a cross join b) natural join c

> These two example queries should give the same result. In the first query, it
> seems like it’s doing the natural join between b and c, and then does
> the Cartesian product on that result with a. On the second query, it does as
> we assume it should, namely does the Cartesian product first.
>
> Is this the correct behavior?

Yes.

You can add parentheses to regroup the explicit joins, like

select * from a cross join (b natural join c);

But the implicit join is harder to fix. I think you either need to use an
explicit join like above or a subquery like

select * from (select * from a,b) as ab natural join c

I tend to find it's easier to stick to all explicit or all implicit joins and
not mix them. Personally I like explicit joins for aesthetic reasons
especially in 7.4 where they get optimized as well as implicit joins.
--
greg
---------------------------(end of broadcast)---------------------------
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to ma*******@postgresql.org so that your
message can get through to the mailing list cleanly

Nov 12 '05 #2
Åsmund Kveim Lie <as******@skipthis.ifi.uio.no> writes:

> SELECT * FROM a,b NATURAL JOIN c;
> SELECT * FROM a CROSS JOIN b NATURAL JOIN c;
> These two example queries should give the same result.

No, they shouldn't, because JOIN binds more tightly than comma. The
first is equivalent to

SELECT * FROM a CROSS JOIN (b NATURAL JOIN c);

while in the second case the JOINs associate left-to-right, giving

SELECT * FROM (a CROSS JOIN b) NATURAL JOIN c;

Because you have columns with the same names in A and C, the second
NATURAL JOIN has a different implicit join clause than the first.
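
One way to see the difference is to spell out the join clause that each NATURAL JOIN implies. The following is only a sketch against the tables a, b and c from the original post, not part of the original message; JOIN ... USING names the shared columns explicitly:

```sql
-- First query: only b and c are naturally joined,
-- so b1 is the sole shared column name.
SELECT * FROM a CROSS JOIN (b JOIN c USING (b1));

-- Second query: (a CROSS JOIN b) exposes both a1 and b1, and c has
-- both as well, so the natural join compares a1 in addition to b1.
SELECT * FROM (a CROSS JOIN b) JOIN c USING (a1, b1);
```

With the sample rows, a.a1 is 'a1' while c.a1 is 'a3', so the second form has nothing to match on a1, which is consistent with the empty result reported in the original post.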

(Personally I think NATURAL JOIN is an evil, bug-prone construct,
precisely because coincidental matches of column names will mess up your
results.)

> In the first query, it seems like it’s doing the natural
> join between b and c, and then does the Cartesian product on that
> result with a. On the second query, it does as we assume it should,
> namely does the Cartesian product first.


I think your expectations have been set by MySQL, which last I heard
interprets all joins as being done left-to-right. That's not compliant
with the SQL standard, however.

regards, tom lane


Nov 12 '05 #3
On Fri, 31 Oct 2003, Tom Lane wrote:

> (Personally I think NATURAL JOIN is an evil, bug-prone construct,
> precisely because coincidental matches of column names will mess up your
> results.)


Me too. When I first saw it, I figured it would "naturally join" the two
tables on their fk/pk relation if there was one. That seems natural.
Joining on two fields that just happen to have the same name is unnatural
to me.

Nov 12 '05 #4

"scott.marlowe" <sc***********@ihs.com> writes:

> On Fri, 31 Oct 2003, Tom Lane wrote:
> > (Personally I think NATURAL JOIN is an evil, bug-prone construct,
> > precisely because coincidental matches of column names will mess up your
> > results.)
>
> Me too. When I first saw it, I figured it would "naturally join" the two
> tables on their fk/pk relation if there was one. That seems natural.
> Joining on two fields that just happen to have the same name is unnatural
> to me.


Well 99% of the time I impose on myself a constraint to only use the same name
iff they refer to the same attribute. So if they have the same name then they
really ought to be a reasonable join clause.

However the 1% are things like "date_created, date_updated" or even flags like
"active", "deleted" etc. Which are more than enough to make it utterly
useless.
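
A small sketch of that failure mode, using hypothetical tables customers and orders (the table and column names are assumptions made up for illustration, not taken from this thread). Both tables carry a date_created column, so NATURAL JOIN silently adds it to the join condition:

```sql
-- Hypothetical schema, for illustration only.
CREATE TABLE customers (customer_id integer, name text, date_created date);
CREATE TABLE orders (order_id integer, customer_id integer, date_created date);

-- NATURAL JOIN matches on every shared column name, so this behaves
-- like USING (customer_id, date_created): an order is kept only if it
-- was created on the same date as its customer.
SELECT * FROM customers NATURAL JOIN orders;

-- Naming the join column keeps the bookkeeping column out of the condition.
SELECT * FROM customers JOIN orders USING (customer_id);
```

The second form costs a few more keystrokes, but it does not change meaning when someone later adds another same-named column to either table.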

Too bad really, it would be a handy thing for ad-hoc queries typed at psql. It
would still seem too fragile for production queries though.

--
greg

Nov 12 '05 #5
Greg Stark wrote:

> "scott.marlowe" <sc***********@ihs.com> writes:
>
> > On Fri, 31 Oct 2003, Tom Lane wrote:
> > > (Personally I think NATURAL JOIN is an evil, bug-prone construct,
> > > precisely because coincidental matches of column names will mess up your
> > > results.)
> >
> > Me too. When I first saw it, I figured it would "naturally join" the two
> > tables on their fk/pk relation if there was one. That seems natural.
> > Joining on two fields that just happen to have the same name is unnatural
> > to me.
>
> Well 99% of the time I impose on myself a constraint to only use the same name
> iff they refer to the same attribute. So if they have the same name then they
> really ought to be a reasonable join clause.
>
> However the 1% are things like "date_created, date_updated" or even flags like
> "active", "deleted" etc. Which are more than enough to make it utterly
> useless.
>
> Too bad really, it would be a handy thing for ad-hoc queries typed at psql. It
> would still seem too fragile for production queries though.


I think the reason they don't use pk/fk in natural joins is that you can
join all sorts of results, such as a SELECT in FROM, that don't always have
a meaningful pk/fk.
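
For instance, both inputs below are subqueries in FROM, so there is no declared key for the join to consult; only the output column names are available. This is a sketch against the tables a, b and c from the original post, and the aliases ab and bc are made up for illustration:

```sql
-- Neither side is a base table, so no pk/fk relationship exists here.
-- NATURAL JOIN can only use the column names the subqueries expose;
-- the one name they share is b1.
SELECT *
FROM (SELECT a1, b1 FROM a, b) AS ab
NATURAL JOIN (SELECT b1, c1 FROM c) AS bc;
```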

--
Bruce Momjian | http://candle.pha.pa.us
pg***@candle.pha.pa.us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073


Nov 12 '05 #6
