By using this site, you agree to our updated Privacy Policy and our Terms of Use. Manage your Cookies Settings.
439,971 Members | 1,451 Online
Bytes IT Community
+ Ask a Question
Need help? Post your question and get tips & solutions from a community of 439,971 IT Pros & Developers. It's quick & easy.

gist index build produces corrupt result on first access totable.

P: n/a
We've implemented a 5D box data type and have implemented both RTree and
GiST access methods under PostgresSQL 7.4 and PostgresSQL 7.4.1. The 5D box
internally looks like:
struct Box5D{
float minBounds[5];
float maxBounds[5];
};
and so takes up 40 bytes and is of fixed length. The GiST access methods
are basically a generalization of the 2D box indexing access methods from
Refraction Research's PostGIS.

We've tested this by building an index on a table that contains 30000 rows
(no nulls) representing small non-overlapping boxes on a 2D plane (ie, all
the values for indices 0 and 1 vary; the values for indices 2,3, and 4 are
constant). Then we try a select on the table with an overlaps test in the
where clause that should result in 66 rows being returned.
We've used an "explain select ..... " to verify that the index is used
during the search.

An RTree index built on the table works fine, always. The ability to build
a GiST index is less reliable; sometimes the index is built properly and
subsequent searchs on it work perfectly. Sometimes the index is built
incorrectly, and subsequent searchs on it fail. We have examined the keys
passed to our GiST consistent access method during the select and have
observed that when the index hasn't been built properly, the first key
passed to the consistent method is gibberish (strange values for indices 0
and 1, strange nonzero values for indices 2,3 and 4). When the index has
been built properly, the first key passed to the consistent method has
reasonable looking values.

We've been able to isolate this behavior somewhat: the index build only
seems to produce corrupt results if the data connection that sent the
command did not perform any preceding statements that would have caused the
boxes to have been read from disk. For example, the sequence:
<open connection to db>
create index box_index on box_table using gist(box_column);

would produce a corrupt index.
However, the sequence

<open connection to db>
select count(*) from box_table where box_column && '0 0 0 0 0 1 1 1 1 1';
create index box_index on box_table using gist(box_column);

produces a valid index,
as does the sequence:

<open connection to db>
create index box_index on box_table using gist(box_column);
drop index box_index;
create index box_index on box_table using gist(box_column);

In short, it appears that an operation that forces the boxes to read from a
table
needs to be done to "prime the index building pump".

We've tried installing PostGIS and it does not seem to be affected by any
similar problems.
The notable differences between our code and Refraction's are:
* our input geometries are fixed length, rather than variable length
* our compress method just returns the GISTENTRY * it was passed as an
argument. We've tried returning a deep copy of the GISENTRY *, but that
made no difference.
We've looked at the contrib/cube code, but it's also a variable length
parameter, and it uses a very different parameter passing mechanism.
In the short term, we can work around the issue, but it would be nice to be
able to fix the problem. Any ideas are welcome.

Thank you.
Eric.

**********************************************
Eric Davies, M.Sc.
Barrodale Computing Services Ltd.
Tel: (250) 472-4372 Fax: (250) 472-4373
Web: http://www.barrodale.com
Email: er**@barrodale.com
**********************************************
Mailing Address:
P.O. Box 3075 STN CSC
Victoria BC Canada V8W 3W2

Shipping Address:
Hut R, McKenzie Avenue
University of Victoria
Victoria BC Canada V8W 3W2
**********************************************

Nov 23 '05 #1
Share this Question
Share on Google+
3 Replies


P: n/a
Eric Davies <er**@barrodale.com> writes:
We've looked at the contrib/cube code, but it's also a variable length
parameter, and it uses a very different parameter passing mechanism.


Different how? I have a hard-to-pin-down intuition that your problem is
closely associated with this issue, but without details it's impossible
to say more. IIRC there were once assumptions in rtree and/or gist
about the indexable datatype being pass-by-ref and/or varlena, and I'm
not sure how much of that has been fixed.

This seems a bit far afield for pgsql-general, however. I would
suggest offering more details on -hackers. If you could post your code
for people to play with, you'd likely get answers pretty quickly.

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 6: Have you searched our list archives?

http://archives.postgresql.org

Nov 23 '05 #2

P: n/a


Tom Lane wrote:
Eric Davies <er**@barrodale.com> writes:
We've looked at the contrib/cube code, but it's also a variable length
parameter, and it uses a very different parameter passing mechanism.

Different how? I have a hard-to-pin-down intuition that your problem is
closely associated with this issue, but without details it's impossible
to say more. IIRC there were once assumptions in rtree and/or gist
about the indexable datatype being pass-by-ref and/or varlena, and I'm
not sure how much of that has been fixed.

contribs rtree_gist, btree_gist and seg uses pass-by-ref fixed length value for
storage, and there isn't such problem.
It seems to me, the problem may be in union or picksplit methods, but I needs
more info.

I advice to play with gevel module (
http://www.sai.msu.su/~megera/postgr...l/README.gevel,
http://www.sai.msu.su/~megera/postgr...l/gevel.tar.gz
)

--
Teodor Sigaev E-mail: te****@sigaev.ru

---------------------------(end of broadcast)---------------------------
TIP 6: Have you searched our list archives?

http://archives.postgresql.org

Nov 23 '05 #3

P: n/a

At 10:28 PM 3/3/2004, Tom Lane wrote:
Eric Davies <er**@barrodale.com> writes:
We've looked at the contrib/cube code, but it's also a variable length
parameter, and it uses a very different parameter passing mechanism.
Different how? I have a hard-to-pin-down intuition that your problem is


the parameter passing mechanism used by Postgis (and our 5D box) is to
pass a FunctionCallInfoData structure, via a PG_FUNCTION_ARGS macro.
The Postgis penalty function starts with:
/*
** The GiST Penalty method for boxes
** As in the R-tree paper, we use change in area as our penalty metric
*/
PG_FUNCTION_INFO_V1(gbox_penalty);
Datum gbox_penalty(PG_FUNCTION_ARGS)
{
GISTENTRY *origentry = (GISTENTRY *) PG_GETARG_POINTER(0);

the cube code appears to pass each argument separately, as one would
normally do with a
C program. Ie, the g_cube_penalty function starts with:
/*
** The GiST Penalty method for boxes
** As in the R-tree paper, we use change in area as our penalty metric
*/
float *
g_cube_penalty(GISTENTRY *origentry, GISTENTRY *newentry, float *result)
{
NDBOX *ud;
double tmp1,

Again, the postgis code seems to work fine with GiST indexes, so I would
rule out the argument passing mechanism as being of influence.

closely associated with this issue, but without details it's impossible
to say more. IIRC there were once assumptions in rtree and/or gist
about the indexable datatype being pass-by-ref and/or varlena, and I'm
not sure how much of that has been fixed.

This seems a bit far afield for pgsql-general, however. I would
suggest offering more details on -hackers. If you could post your code
for people to play with, you'd likely get answers pretty quickly.
I figured likewise, but the etiquete for the postgresql.org group required
I try the more general group first. :-).
I'll give that a shot now.
regards, tom lane


**********************************************
Eric Davies, M.Sc.
Barrodale Computing Services Ltd.
Tel: (250) 472-4372 Fax: (250) 472-4373
Web: http://www.barrodale.com
Email: er**@barrodale.com
**********************************************
Mailing Address:
P.O. Box 3075 STN CSC
Victoria BC Canada V8W 3W2

Shipping Address:
Hut R, McKenzie Avenue
University of Victoria
Victoria BC Canada V8W 3W2
**********************************************

Nov 23 '05 #4

This discussion thread is closed

Replies have been disabled for this discussion.