473,782 Members | 2,439 Online
Bytes | Software Development & Data Engineering Community
+ Post

Home Posts Topics Members FAQ

gist index build produces corrupt result on first access totable.

We've implemented a 5D box data type and have implemented both RTree and
GiST access methods under PostgresSQL 7.4 and PostgresSQL 7.4.1. The 5D box
internally looks like:
struct Box5D{
float minBounds[5];
float maxBounds[5];
};
and so takes up 40 bytes and is of fixed length. The GiST access methods
are basically a generalization of the 2D box indexing access methods from
Refraction Research's PostGIS.

We've tested this by building an index on a table that contains 30000 rows
(no nulls) representing small non-overlapping boxes on a 2D plane (ie, all
the values for indices 0 and 1 vary; the values for indices 2,3, and 4 are
constant). Then we try a select on the table with an overlaps test in the
where clause that should result in 66 rows being returned.
We've used an "explain select ..... " to verify that the index is used
during the search.

An RTree index built on the table works fine, always. The ability to build
a GiST index is less reliable; sometimes the index is built properly and
subsequent searchs on it work perfectly. Sometimes the index is built
incorrectly, and subsequent searchs on it fail. We have examined the keys
passed to our GiST consistent access method during the select and have
observed that when the index hasn't been built properly, the first key
passed to the consistent method is gibberish (strange values for indices 0
and 1, strange nonzero values for indices 2,3 and 4). When the index has
been built properly, the first key passed to the consistent method has
reasonable looking values.

We've been able to isolate this behavior somewhat: the index build only
seems to produce corrupt results if the data connection that sent the
command did not perform any preceding statements that would have caused the
boxes to have been read from disk. For example, the sequence:
<open connection to db>
create index box_index on box_table using gist(box_column );

would produce a corrupt index.
However, the sequence

<open connection to db>
select count(*) from box_table where box_column && '0 0 0 0 0 1 1 1 1 1';
create index box_index on box_table using gist(box_column );

produces a valid index,
as does the sequence:

<open connection to db>
create index box_index on box_table using gist(box_column );
drop index box_index;
create index box_index on box_table using gist(box_column );

In short, it appears that an operation that forces the boxes to read from a
table
needs to be done to "prime the index building pump".

We've tried installing PostGIS and it does not seem to be affected by any
similar problems.
The notable differences between our code and Refraction's are:
* our input geometries are fixed length, rather than variable length
* our compress method just returns the GISTENTRY * it was passed as an
argument. We've tried returning a deep copy of the GISENTRY *, but that
made no difference.
We've looked at the contrib/cube code, but it's also a variable length
parameter, and it uses a very different parameter passing mechanism.
In the short term, we can work around the issue, but it would be nice to be
able to fix the problem. Any ideas are welcome.

Thank you.
Eric.

*************** *************** *************** *
Eric Davies, M.Sc.
Barrodale Computing Services Ltd.
Tel: (250) 472-4372 Fax: (250) 472-4373
Web: http://www.barrodale.com
Email: er**@barrodale. com
*************** *************** *************** *
Mailing Address:
P.O. Box 3075 STN CSC
Victoria BC Canada V8W 3W2

Shipping Address:
Hut R, McKenzie Avenue
University of Victoria
Victoria BC Canada V8W 3W2
*************** *************** *************** *

Nov 23 '05 #1
3 2215
Eric Davies <er**@barrodale .com> writes:
We've looked at the contrib/cube code, but it's also a variable length
parameter, and it uses a very different parameter passing mechanism.


Different how? I have a hard-to-pin-down intuition that your problem is
closely associated with this issue, but without details it's impossible
to say more. IIRC there were once assumptions in rtree and/or gist
about the indexable datatype being pass-by-ref and/or varlena, and I'm
not sure how much of that has been fixed.

This seems a bit far afield for pgsql-general, however. I would
suggest offering more details on -hackers. If you could post your code
for people to play with, you'd likely get answers pretty quickly.

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 6: Have you searched our list archives?

http://archives.postgresql.org

Nov 23 '05 #2


Tom Lane wrote:
Eric Davies <er**@barrodale .com> writes:
We've looked at the contrib/cube code, but it's also a variable length
parameter, and it uses a very different parameter passing mechanism.

Different how? I have a hard-to-pin-down intuition that your problem is
closely associated with this issue, but without details it's impossible
to say more. IIRC there were once assumptions in rtree and/or gist
about the indexable datatype being pass-by-ref and/or varlena, and I'm
not sure how much of that has been fixed.

contribs rtree_gist, btree_gist and seg uses pass-by-ref fixed length value for
storage, and there isn't such problem.
It seems to me, the problem may be in union or picksplit methods, but I needs
more info.

I advice to play with gevel module (
http://www.sai.msu.su/~megera/postgr...l/README.gevel,
http://www.sai.msu.su/~megera/postgr...l/gevel.tar.gz
)

--
Teodor Sigaev E-mail: te****@sigaev.r u

---------------------------(end of broadcast)---------------------------
TIP 6: Have you searched our list archives?

http://archives.postgresql.org

Nov 23 '05 #3

At 10:28 PM 3/3/2004, Tom Lane wrote:
Eric Davies <er**@barrodale .com> writes:
We've looked at the contrib/cube code, but it's also a variable length
parameter, and it uses a very different parameter passing mechanism.
Different how? I have a hard-to-pin-down intuition that your problem is


the parameter passing mechanism used by Postgis (and our 5D box) is to
pass a FunctionCallInf oData structure, via a PG_FUNCTION_ARG S macro.
The Postgis penalty function starts with:
/*
** The GiST Penalty method for boxes
** As in the R-tree paper, we use change in area as our penalty metric
*/
PG_FUNCTION_INF O_V1(gbox_penal ty);
Datum gbox_penalty(PG _FUNCTION_ARGS)
{
GISTENTRY *origentry = (GISTENTRY *) PG_GETARG_POINT ER(0);

the cube code appears to pass each argument separately, as one would
normally do with a
C program. Ie, the g_cube_penalty function starts with:
/*
** The GiST Penalty method for boxes
** As in the R-tree paper, we use change in area as our penalty metric
*/
float *
g_cube_penalty( GISTENTRY *origentry, GISTENTRY *newentry, float *result)
{
NDBOX *ud;
double tmp1,

Again, the postgis code seems to work fine with GiST indexes, so I would
rule out the argument passing mechanism as being of influence.

closely associated with this issue, but without details it's impossible
to say more. IIRC there were once assumptions in rtree and/or gist
about the indexable datatype being pass-by-ref and/or varlena, and I'm
not sure how much of that has been fixed.

This seems a bit far afield for pgsql-general, however. I would
suggest offering more details on -hackers. If you could post your code
for people to play with, you'd likely get answers pretty quickly.
I figured likewise, but the etiquete for the postgresql.org group required
I try the more general group first. :-).
I'll give that a shot now.
regards, tom lane


*************** *************** *************** *
Eric Davies, M.Sc.
Barrodale Computing Services Ltd.
Tel: (250) 472-4372 Fax: (250) 472-4373
Web: http://www.barrodale.com
Email: er**@barrodale. com
*************** *************** *************** *
Mailing Address:
P.O. Box 3075 STN CSC
Victoria BC Canada V8W 3W2

Shipping Address:
Hut R, McKenzie Avenue
University of Victoria
Victoria BC Canada V8W 3W2
*************** *************** *************** *

Nov 23 '05 #4

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

4
2743
by: Dmitry Tkach | last post by:
Hi, everybody! I am trying to create a custom GiST index in 7.3, but getting an error, that I don't know how to interpret: testdb=# create table gist_test (field int8); CREATE TABLE testdb=# create index gist_idx on gist_test using gist (field); ERROR: data type bigint has no default operator class for access method "gist"
0
1352
by: George Essig | last post by:
I have installed tsearch2 and have noticed that the gist index used to do searches grows and grows as I update rows, delete rows, or run VACUUM FULL ANALYZE. Below are some details: PostgreSQL 7.4RC1 Red Hat 9 Table "public.series" Column | Type | Modifiers ---------------+-------------------+-------------------------------------------------------- id | integer | not null...
14
5423
by: Sean C. | last post by:
Helpful folks, Most of my previous experience with DB2 was on s390 mainframe systems and the optimizer on this platform always seemed very predictable and consistent. Since moving to a WinNT/UDB 7.2 environment, the choices the optimizer makes often seem flaky. But this last example really floored me. I was hoping someone could explain why I get worse response time when the optimizer uses two indexes, than when it uses one. Some context:
8
5265
by: Mike | last post by:
Hello, I have a few rather urgent questions that I hope someone can help with (I need to figure this out prior to a meeting tomorrow.) First, a bit of background: The company I work for is developing a web-based application, one part of which involves allowing the user the ability to page through transaction "history" information. The _summary_ history table will have the following fields: ServiceName, Date, User-Ref1, User-Ref2,...
4
5083
by: aaj | last post by:
Hi all I have been given a corrupt access 2000 backend and asked to salvage the data.(youve gussed it, no backup) On trying to open it just kicks me out saying its not a valid database file - so I can't even use the internal compact/repair function I have tried the usal stuff i.e. Trying to repair via another access program using dao
25
1750
by: kapilk | last post by:
Sir, I know that the array index starts in C from 0 and not 1 can any body pls. tell me the reason. Is it because in the subscript i can have a unsigned integer and these start from 0 Thanks
10
2314
by: Reynard Hilman | last post by:
Hi, I have been having this problem where the database size suddenly grows from the normal size of about 300Mb to 12Gb in one night. When I look up the table size, the biggest one is only 41Mb and the total of all table size is only 223Mb. But in the filesystem data directory the total size is 12Gb. I noticed there are 10 files with 1Gb size each: 1.1G 25677563 1.1G 25677563.1
3
1602
by: Net Virtual Mailing Lists | last post by:
Hello, I have a table like this with some indexes as identified: CREATE TABLE sometable ( data TEXT, data_fti TSVECTOR, category1 INTEGER, category2 INTEGER,
4
2283
by: BravoFoxtrot | last post by:
Hi, I'm trying to build an index into a multi dimensional associative array. I may not know how many dimensions there are so i want to pass the array indexes as a variable. $arrayToAccess = array( array('Data'=>array('id'=>1, 'title'=>'Manager')), array('Data'=>array('id'=>1, 'title'='Clerk')) );
0
9479
by: Hystou | last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can effortlessly switch the default language on Windows 10 without reinstalling. I'll walk you through it. First, let's disable language synchronization. With a Microsoft account, language settings sync across devices. To prevent any complications,...
0
10311
Oralloy
by: Oralloy | last post by:
Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers, it seems that the internal comparison operator "<=>" tries to promote arguments from unsigned to signed. This is as boiled down as I can make it. Here is my compilation command: g++-12 -std=c++20 -Wnarrowing bit_field.cpp Here is the code in...
0
10146
jinu1996
by: jinu1996 | last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven tapestry of website design and digital marketing. It's not merely about having a website; it's about crafting an immersive digital experience that captivates audiences and drives business growth. The Art of Business Website Design Your website is...
1
10080
by: Hystou | last post by:
Overview: Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows Update option using the Control Panel or Settings app; it automatically checks for updates and installs any it finds, whether you like it or not. For most users, this new feature is actually very convenient. If you want to control the update process,...
0
9942
tracyyun
by: tracyyun | last post by:
Dear forum friends, With the development of smart home technology, a variety of wireless communication protocols have appeared on the market, such as Zigbee, Z-Wave, Wi-Fi, Bluetooth, etc. Each protocol has its own unique characteristics and advantages, but as a user who is planning to build a smart home system, I am a bit confused by the choice of these technologies. I'm particularly interested in Zigbee because I've heard it does some...
0
8967
agi2029
by: agi2029 | last post by:
Let's talk about the concept of autonomous AI software engineers and no-code agents. These AIs are designed to manage the entire lifecycle of a software development project—planning, coding, testing, and deployment—without human intervention. Imagine an AI that can take a project description, break it down, write the code, debug it, and then launch it, all on its own.... Now, this would greatly impact the work of software developers. The idea...
1
7492
isladogs
by: isladogs | last post by:
The next Access Europe User Group meeting will be on Wednesday 1 May 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome a new presenter, Adolph Dupré who will be discussing some powerful techniques for using class modules. He will explain when you may want to use classes instead of User Defined Types (UDT). For example, to manage the data in unbound forms. Adolph will...
2
3639
muto222
by: muto222 | last post by:
How can i add a mobile payment intergratation into php mysql website.
3
2874
bsmnconsultancy
by: bsmnconsultancy | last post by:
In today's digital era, a well-designed website is crucial for businesses looking to succeed. Whether you're a small business owner or a large corporation in Toronto, having a strong online presence can significantly impact your brand's success. BSMN Consultancy, a leader in Website Development in Toronto offers valuable insights into creating effective websites that not only look great but also perform exceptionally well. In this comprehensive...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.