473,545 Members | 2,782 Online
Bytes | Software Development & Data Engineering Community
+ Post

Home Posts Topics Members FAQ

Index problem.... GIST (tsearch2)

Hello,

I have a table like this with some indexes as identified:
CREATE TABLE sometable (
data TEXT,
data_fti TSVECTOR,
category1 INTEGER,
category2 INTEGER,
category3 INTEGER
);

CREATE OR REPLACE FUNCTION is_null(anyelem ent) RETURNS BOOLEAN AS 'SELECT
$1 IS NULL;' LANGUAGE 'SQL' IMMUTABLE;
CREATE FUNCTION sometable_categ ory1_idx ON sometable (category1);
CREATE FUNCTION sometable_categ ory2_idx ON sometable (category2);
CREATE FUNCTION sometable_categ ory3_idx ON sometable (category3);

CREATE FUNCTION sometable_data_ fti_idx ON sometable USING gist(data_fti);
When I do a query like this, it uses sometable_categ ory1_idx and is very
fast (it only returns a few rows out of several thousand)

SELECT * from sometable WHERE is_null(categor y1)='f';

When I do a query like this though it is slow because it insists on doing
the full-text index first:

SELECT * from sometable WHERE is_null(categor y1)='f' AND data_fti @@
to_tsquery('def ault', 'postgres');
How can I make this query first use the is_null index?... It strikes me
that this would almost always be faster then doing the full-text search
first, right?...
Thanks!

- Greg


---------------------------(end of broadcast)---------------------------
TIP 1: subscribe and unsubscribe commands go to ma*******@postg resql.org

Nov 23 '05 #1
3 1581

"Net Virtual Mailing Lists" <ma**********@n et-virtual.com> writes:
SELECT * from sometable WHERE is_null(categor y1)='f' AND data_fti @@
to_tsquery('def ault', 'postgres');

How can I make this query first use the is_null index?... It strikes me
that this would almost always be faster then doing the full-text search
first, right?...


Well that depends on how many are false versus how many the full-text search
finds.

In this circumstance postgres is trying to compare two unknowns. It doesn't
know how often is_nul() is going to return false, and it doesn't know how many
records the full text search will match.

8.0 will have statistics on how often is_null() will return false. But that
isn't really going to solve your problem since it still won't have any idea
how many rows the full text search will find.

I don't even know of anything you can do to influence the selectivity
estimates of the full text search.

--
greg
---------------------------(end of broadcast)---------------------------
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddres sHere" to ma*******@postg resql.org)

Nov 23 '05 #2
Greg Stark <gs*****@mit.ed u> writes:
8.0 will have statistics on how often is_null() will return false. But that
isn't really going to solve your problem since it still won't have any idea
how many rows the full text search will find. I don't even know of anything you can do to influence the selectivity
estimates of the full text search.


Write some code ;-) ?

Seriously, we desperately need some people thinking about how to do
statistics and selectivity estimates for these sorts of complex
indexable conditions. Even crude estimates would be better than none
at all, which is where we're at now. I think that as of 8.0 there is
sufficient infrastructure in place to collect datatype-specific stats
and do something with them --- but *what* to do is now the pressing
problem.

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faqs/FAQ.html

Nov 23 '05 #3
"Net Virtual Mailing Lists" <ma**********@n et-virtual.com> writes:
I have a table like this with some indexes as identified: CREATE OR REPLACE FUNCTION is_null(anyelem ent) RETURNS BOOLEAN AS 'SELECT
$1 IS NULL;' LANGUAGE 'SQL' IMMUTABLE;
CREATE FUNCTION sometable_categ ory1_idx ON sometable (category1);
CREATE FUNCTION sometable_categ ory2_idx ON sometable (category2);
CREATE FUNCTION sometable_categ ory3_idx ON sometable (category3); CREATE FUNCTION sometable_data_ fti_idx ON sometable USING gist(data_fti);


[ raises eyebrow... ] It'd be easier to offer advice if you accurately
depicted what you'd done. The above isn't even syntactically valid.

I suppose what you meant is

CREATE INDEX sometable_categ ory1_idx ON sometable (is_null(catego ry1));

The main problem with this is that before 8.0 there are no stats on
functional indexes, and so the planner has no idea that the condition
is_null(categor y1)='f' is very selective. (If you looked at the
rowcount estimates from EXPLAIN this would be pretty obvious.)

What I would suggest is that you forget the functional indexes and use
partial indexes:

CREATE INDEX sometable_categ ory1_idx ON sometable (category1)
WHERE category1 IS NOT NULL;

SELECT * from sometable WHERE category1 IS NOT NULL AND data_fti @@
to_tsquery('def ault', 'postgres');

7.4 has a reasonable chance of figuring out that the category1_idx
is the thing to use if you cast it this way.

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 9: the planner will ignore your desire to choose an index scan if your
joining column's datatypes do not match

Nov 23 '05 #4

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

4
2723
by: Dmitry Tkach | last post by:
Hi, everybody! I am trying to create a custom GiST index in 7.3, but getting an error, that I don't know how to interpret: testdb=# create table gist_test (field int8); CREATE TABLE testdb=# create index gist_idx on gist_test using gist (field); ERROR: data type bigint has no default operator class for access method "gist"
6
1964
by: Jeff Davis | last post by:
After installing PostgreSQL 7.4 beta 1 from source, I decided to install the /contrib module tsearch2. I cd to the tsearch2 directory and typed "make", however I get an error that yy_current_buffer is an undeclared identifier in wordparser/parser.c (which is apparently autogenerated with flex from parser.l). I looked online, and it seemed...
1
3420
by: Johann Uhrmann | last post by:
Hello, are there any experiences about the performance of indices with different data types. How do the performance of an index that consists of - an integer field - a varchar() field - a text field
1
1749
by: psql-mail | last post by:
I have applied the recent tsearch2 patch and recompiled the tsearch2 module but I am still experiencing the same backend crashes as I previously described. Thanks for any help, Mat GDB output:
0
1331
by: George Essig | last post by:
I have installed tsearch2 and have noticed that the gist index used to do searches grows and grows as I update rows, delete rows, or run VACUUM FULL ANALYZE. Below are some details: PostgreSQL 7.4RC1 Red Hat 9 Table "public.series" Column | Type | Modifiers...
0
446
by: konf | last post by:
Well, now I tried to compile with tsearch2. I do in src direcotry: ../configure then change into contrib/tsearch2 and do make (as I read in manual) and I got:
3
2187
by: Eric Davies | last post by:
We've implemented a 5D box data type and have implemented both RTree and GiST access methods under PostgresSQL 7.4 and PostgresSQL 7.4.1. The 5D box internally looks like: struct Box5D{ float minBounds; float maxBounds; }; and so takes up 40 bytes and is of fixed length. The GiST access methods are basically a generalization of the 2D box...
10
2273
by: Reynard Hilman | last post by:
Hi, I have been having this problem where the database size suddenly grows from the normal size of about 300Mb to 12Gb in one night. When I look up the table size, the biggest one is only 41Mb and the total of all table size is only 223Mb. But in the filesystem data directory the total size is 12Gb. I noticed there are 10 files with 1Gb...
2
1417
by: Abandoned | last post by:
Hi.. I want to do index in postgresql & python. My table: id(int) | id2(int) | w(int) | d(int) My query: select id, w where id=x and id2=y (sometimes and d=z) I have too many insert and select operation on this table. And which index type can i use ? Btree, Rtree, Gist or Hash ?
0
7432
by: Hystou | last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can effortlessly switch the default language on Windows 10 without reinstalling. I'll walk you through it. First, let's disable language...
0
7689
Oralloy
by: Oralloy | last post by:
Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers, it seems that the internal comparison operator "<=>" tries to promote arguments from unsigned to signed. This is as boiled down as I can make it. ...
0
7943
jinu1996
by: jinu1996 | last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven tapestry of website design and digital marketing. It's not merely about having a website; it's about crafting an immersive digital experience that...
0
7786
tracyyun
by: tracyyun | last post by:
Dear forum friends, With the development of smart home technology, a variety of wireless communication protocols have appeared on the market, such as Zigbee, Z-Wave, Wi-Fi, Bluetooth, etc. Each protocol has its own unique characteristics and advantages, but as a user who is planning to build a smart home system, I am a bit confused by the...
1
5359
isladogs
by: isladogs | last post by:
The next Access Europe User Group meeting will be on Wednesday 1 May 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome a new presenter, Adolph Dupré who will be discussing some powerful techniques for using class modules. He will explain when you may want to use classes...
0
3490
by: TSSRALBI | last post by:
Hello I'm a network technician in training and I need your help. I am currently learning how to create and manage the different types of VPNs and I have a question about LAN-to-LAN VPNs. The last exercise I practiced was to create a LAN-to-LAN VPN between two Pfsense firewalls, by using IPSEC protocols. I succeeded, with both firewalls in...
1
1919
by: 6302768590 | last post by:
Hai team i want code for transfer the data from one system to another through IP address by using C# our system has to for every 5mins then we have to update the data what the data is updated we have to send another system
1
1044
muto222
by: muto222 | last post by:
How can i add a mobile payment intergratation into php mysql website.
0
743
bsmnconsultancy
by: bsmnconsultancy | last post by:
In today's digital era, a well-designed website is crucial for businesses looking to succeed. Whether you're a small business owner or a large corporation in Toronto, having a strong online presence can significantly impact your brand's success. BSMN Consultancy, a leader in Website Development in Toronto offers valuable insights into creating...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.