Join works in 7.3.6, fails in 7.4.2

I have a query that works in 7.3.6 but not in 7.4.2 unless I turn
off enable_hashjoin. I'm joining a table of network interfaces and
a table of networks so I can find additional info about a particular
interface's network. To speed up the join, I'm indexing the
interface IP addresses using a function that converts the IP address
to its network address; this way the join doesn't have to scan using
the << or >> operator.

Here's a reduced example of what I'm doing:

CREATE FUNCTION inet2net (INET) RETURNS INET AS '
SELECT NETWORK(SET_MASKLEN($1, 24));
' LANGUAGE SQL IMMUTABLE;

CREATE TABLE ipinterface (
ifid INTEGER NOT NULL PRIMARY KEY,
ifaddr INET NOT NULL
);

CREATE INDEX ipinterface_ifaddr_idx ON ipinterface (ifaddr);
CREATE INDEX ipinterface_ifaddrnet_idx ON ipinterface (inet2net(ifaddr));

CREATE TABLE ipnet (
netid INTEGER NOT NULL PRIMARY KEY,
netaddr INET NOT NULL,
CONSTRAINT uniq_netaddr UNIQUE (netaddr)
);

CREATE INDEX ipnet_netaddr_idx ON ipnet (netaddr);
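
With this schema, the join that the function index is meant to avoid can be sketched with the containment operator. This is only an illustration for comparison: it assumes each ifaddr is a host address that falls strictly inside its network, and unlike inet2net() it matches a containing network of any prefix length, not just a /24.

```sql
-- Containment form of the join: true when i.ifaddr lies strictly
-- inside n.netaddr. Sketch for comparison only; it is not equivalent
-- to the /24-only match performed through inet2net().
SELECT i.ifid, i.ifaddr, n.netid, n.netaddr
FROM ipinterface AS i
JOIN ipnet AS n ON (i.ifaddr << n.netaddr);
```

Joining on equality between inet2net(ifaddr) and netaddr instead lets the planner consider ordinary equality-based join strategies together with the functional index, which is the approach the rest of the example takes.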

After populating the tables, I ran VACUUM ANALYZE on both of them,
so the planner's statistics should be current.

Here's a query that illustrates the problem:

SELECT ifid, ifaddr, netid, netaddr
FROM ipinterface AS i
JOIN ipnet AS n ON (inet2net(i.ifaddr) = n.netaddr)
WHERE netid IN (10, 20);

From my sample data set (available upon request), this query returns
24 rows in 7.3.6, which is correct. Here's the 7.3.6 EXPLAIN ANALYZE:

Nested Loop (cost=0.00..533.78 rows=24 width=32) (actual time=0.20..0.37 rows=24 loops=1)
  -> Index Scan using ipnet_pkey, ipnet_pkey on ipnet n (cost=0.00..6.03 rows=2 width=16) (actual time=0.11..0.12 rows=2 loops=1)
       Index Cond: ((netid = 10) OR (netid = 20))
  -> Index Scan using ipinterface_ifaddrnet_idx on ipinterface i (cost=0.00..262.58 rows=92 width=16) (actual time=0.06..0.10 rows=12 loops=2)
       Index Cond: (inet2net(i.ifaddr) = "outer".netaddr)
Total runtime: 0.52 msec
(6 rows)

The same query in 7.4.2 returns no results. Here's its plan:

Hash Join (cost=6.04..483.92 rows=24 width=30) (actual time=299.948..299.948 rows=0 loops=1)
  Hash Cond: (network(set_masklen("outer".ifaddr, 24)) = "inner".netaddr)
  -> Seq Scan on ipinterface i (cost=0.00..293.32 rows=18432 width=15) (actual time=0.039..130.604 rows=18432 loops=1)
  -> Hash (cost=6.03..6.03 rows=2 width=15) (actual time=0.257..0.257 rows=0 loops=1)
       -> Index Scan using ipnet_pkey, ipnet_pkey on ipnet n (cost=0.00..6.03 rows=2 width=15) (actual time=0.142..0.196 rows=2 loops=1)
            Index Cond: ((netid = 10) OR (netid = 20))
Total runtime: 300.775 ms
(7 rows)

If I turn off enable_hashjoin in 7.4.2 I get 24 rows, as expected:

Nested Loop (cost=0.00..534.87 rows=24 width=30) (actual time=0.301..1.094 rows=24 loops=1)
  -> Index Scan using ipnet_pkey, ipnet_pkey on ipnet n (cost=0.00..6.03 rows=2 width=15) (actual time=0.132..0.180 rows=2 loops=1)
       Index Cond: ((netid = 10) OR (netid = 20))
  -> Index Scan using ipinterface_ifaddrnet_idx on ipinterface i (cost=0.00..262.81 rows=92 width=15) (actual time=0.088..0.242 rows=12 loops=2)
       Index Cond: (network(set_masklen(i.ifaddr, 24)) = "outer".netaddr)
Total runtime: 1.914 ms
(6 rows)
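
The comparison above can be reproduced in a single session along these lines. This is a sketch that assumes the tables and data from the earlier example are already loaded; enable_hashjoin is the planner setting mentioned above, and RESET puts it back to its default afterwards.

```sql
-- Plan and result with hash joins allowed (the default).
EXPLAIN ANALYZE
SELECT ifid, ifaddr, netid, netaddr
FROM ipinterface AS i
JOIN ipnet AS n ON (inet2net(i.ifaddr) = n.netaddr)
WHERE netid IN (10, 20);

-- Disable hash joins for this session only, then run the same query.
SET enable_hashjoin = off;

EXPLAIN ANALYZE
SELECT ifid, ifaddr, netid, netaddr
FROM ipinterface AS i
JOIN ipnet AS n ON (inet2net(i.ifaddr) = n.netaddr)
WHERE netid IN (10, 20);

-- Restore the planner setting to its default.
RESET enable_hashjoin;
```

A join that returns different rows depending only on the join method chosen points at the executor or the operator definitions rather than at the query itself.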

Am I doing something wrong, or should I report this to the bugs
list?

--
Michael Fuhr
http://www.fuhr.org/~mfuhr/

---------------------------(end of broadcast)---------------------------
TIP 7: don't forget to increase your free space map settings

Nov 23 '05 #1
Michael Fuhr <mi**@fuhr.org> writes:
> I have a query that works in 7.3.6 but not in 7.4.2 unless I turn
> off enable_hashjoin. I'm joining a table of network interfaces and
> a table of networks so I can find additional info about a particular
> interface's network.

Hmm. The inet = operator is marked hashable in 7.4 but not in 7.3 ...
I wonder if that is a mistake? I recall looking at the datatype and
deciding there were no insignificant bits in it, but that could be
wrong. Or it could be that the network() function is taking some
shortcut it shouldn't.

Is any of this data IPv6 addresses by any chance?

> From my sample data set (available upon request),

Could we see the specific values that join in 7.3 and fail to do so in
7.4?

regards, tom lane


Nov 23 '05 #2
I wrote:
> Michael Fuhr <mi**@fuhr.org> writes:
> > I have a query that works in 7.3.6 but not in 7.4.2 unless I turn
> > off enable_hashjoin. I'm joining a table of network interfaces and
> > a table of networks so I can find additional info about a particular
> > interface's network.
>
> Hmm. The inet = operator is marked hashable in 7.4 but not in 7.3 ...
> I wonder if that is a mistake?

Digging further, I find that indeed this seems to be a mistake. CIDR
and INET values that have the same address and masklen compare as equal
according to network_eq(), but they will not hash the same because
there's a flag identifying whether a given value is considered CIDR or
INET. And what the network() function returns is marked as a CIDR.
It's a bit surprising that your hash join produces any matches at all...
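
The equality side of that mismatch can be checked from SQL with something like the following. It is a sketch under the behaviour described above, not a test of the hash function itself: the literal addresses are illustrative, and the column alias equal_per_network_eq is made up for readability.

```sql
-- Same address and masklen, one value typed inet and the other cidr.
SELECT '10.0.1.0/24'::inet = '10.0.1.0/24'::cidr AS equal_per_network_eq;

-- network() yields a cidr, so the left side carries the CIDR flag while
-- the right side is a plain inet value such as one stored in a column.
SELECT network(set_masklen('10.0.1.1'::inet, 24)) = '10.0.1.0/24'::inet
       AS equal_per_network_eq;
```

Per the description above, both comparisons should report the values as equal, which is exactly why a hash that also covers the type flag can place equal values in different buckets.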

I believe I got misled on this because there is a hash index operator
class for inet; at one point during the 7.4 cycle I went around and
cleaned up cases where the equality operator's canhash flag was
inconsistent with the set of hash index opclasses. Arguably the hash
opclass is broken, although in practice people probably don't notice the
failure since a given column is likely to contain either all inet or all
cidr values. (And of course it's entirely likely that there *aren't*
any people using the inet hash opclass, period...)

I can think of a number of possible fixes:

1. Mark inet = as not hashjoinable. We'd probably want to remove the
inet hash opclass too.

2. Redefine inet = so that CIDR and INET values are never considered
equal, thus eliminating the unused field. This could be back-patched
into 7.4 but otherwise seems to have little to recommend it. It
would certainly not help solve Michael's problem.

3. Provide a specialized hash method for type inet that ignores the
iptype field.

#3 seems the most desirable going forward, but is probably impractical
to back-patch into 7.4.*, so I'm not sure what to do about the problem
in that branch. Given the relatively low incidence of the problem,
maybe it's okay to just clear the oprcanhash flag in future 7.4.*
releases. This would not fix the problem for existing installations
(unless they initdb) but any complainers could be told how to adjust
their catalogs manually.
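
Whether a given installation is affected can be checked by looking at the flag directly. This is a sketch of such a check; it looks the operator up by name and operand types rather than assuming a particular OID.

```sql
-- Show the hashability flag for the inet = inet operator.
SELECT o.oid, o.oprname, o.oprcanhash
FROM pg_operator AS o
JOIN pg_type AS l ON l.oid = o.oprleft
JOIN pg_type AS r ON r.oid = o.oprright
WHERE o.oprname = '='
  AND l.typname = 'inet'
  AND r.typname = 'inet';
```

If oprcanhash is true, the planner is allowed to choose a hash join on this operator.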

Can anyone think of any other approaches?

regards, tom lane


Nov 23 '05 #3
On Tue, Apr 13, 2004 at 03:42:54PM -0400, Tom Lane wrote:
> Michael Fuhr <mi**@fuhr.org> writes:
> > I have a query that works in 7.3.6 but not in 7.4.2 unless I turn
> > off enable_hashjoin. I'm joining a table of network interfaces and
> > a table of networks so I can find additional info about a particular
> > interface's network.
>
> Hmm. The inet = operator is marked hashable in 7.4 but not in 7.3 ...
> I wonder if that is a mistake? I recall looking at the datatype and
> deciding there were no insignificant bits in it, but that could be
> wrong. Or it could be that the network() function is taking some
> shortcut it shouldn't.

So would a workaround be to set oprcanhash to false for that
operator? I did the following and it appeared to solve the
problem:

UPDATE pg_operator SET oprcanhash = FALSE WHERE oid = 1201;

Or, without knowing that 1201 is the correct OID:

UPDATE pg_operator SET oprcanhash = FALSE
WHERE oprname = '='
AND oprleft IN (SELECT oid FROM pg_type WHERE typname = 'inet');
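
A slightly more defensive variant of the same catalog change is sketched below, as an illustration rather than a tested procedure. It also pins the right-hand operand type and wraps the change in a transaction so the reported row count can be checked before committing. It assumes a superuser session.

```sql
BEGIN;

-- Clear the hashable flag only for the operator whose left and right
-- operands are both inet.
UPDATE pg_operator SET oprcanhash = FALSE
WHERE oprname = '='
  AND oprleft IN (SELECT oid FROM pg_type WHERE typname = 'inet')
  AND oprright IN (SELECT oid FROM pg_type WHERE typname = 'inet');

-- On a stock installation this is expected to touch a single row.
-- If the reported count looks wrong, issue ROLLBACK instead of COMMIT.
COMMIT;
```
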

> Is any of this data IPv6 addresses by any chance?

Nope -- all IPv4.

> > From my sample data set (available upon request),
>
> Could we see the specific values that join in 7.3 and fail to do so in
> 7.4?

I can duplicate the problem with the following data:

INSERT INTO ipinterface VALUES (1, '10.0.1.1');
INSERT INTO ipinterface VALUES (2, '10.0.2.1');
INSERT INTO ipnet VALUES (10, '10.0.1.0/24');
INSERT INTO ipnet VALUES (20, '10.0.2.0/24');
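
With only four rows the planner may not pick a hash join unprompted, so one way to exercise the hash path on this small data set is to disable the other join methods for the session. That is an assumption about planner behaviour rather than something reported in this thread.

```sql
-- Steer the planner toward a hash join by turning off the alternatives.
SET enable_nestloop = off;
SET enable_mergejoin = off;

SELECT ifid, ifaddr, netid, netaddr
FROM ipinterface AS i
JOIN ipnet AS n ON (inet2net(i.ifaddr) = n.netaddr)
WHERE netid IN (10, 20);

-- Restore the planner settings to their defaults.
RESET enable_nestloop;
RESET enable_mergejoin;
```

A correct result has two rows, one per interface; on an affected 7.4 installation the hash join would be expected to return none.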

Thanks for looking into this.

--
Michael Fuhr
http://www.fuhr.org/~mfuhr/


Nov 23 '05 #4
Michael Fuhr <mi**@fuhr.org> writes:
> So would a workaround be to set oprcanhash to false for that
> operator?

Right, see my followup note. This may in fact be the only workable
solution for the 7.4.* series.

regards, tom lane


Nov 23 '05 #5
