473,320 Members | 1,828 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,320 software developers and data experts.

Can LIKE use indexes or not?

Reading the archives and the FAQ, it seems to be implied that LIKE can
use index (and ILIKE can't; so to do case-insensitive search you need to
create a functional index on LOWER(field) and say: LOWER(field) LIKE
'foo%').

However, EXPLAIN always says seq scan for the test data I'm using. I've
done 'set enable_seqscan to off' and it still says seq scan. I was
curious as to how the index will help this query:

db1=> set enable_seqscan to off;
SET
Time: 5.732 ms
db1=> explain select * from t where f like 'xx%';
QUERY PLAN
-------------------------------------------------------------------
Seq Scan on t (cost=100000000.00..100002698.90 rows=89 width=14)
Filter: (f ~~ 'xx%'::text)
(2 rows)

db1=> explain select * from t where lower(f) like 'xx%';
QUERY PLAN
--------------------------------------------------------------------
Seq Scan on t (cost=100000000.00..100002893.68 rows=390 width=14)
Filter: (lower(f) ~~ 'xx%'::text)
(2 rows)

The table is:

db1=> \d t
Table "public.t"
Column | Type | Modifiers
--------+------+-----------
f | text |
Indexes:
"i1" unique, btree (lower(f))
"i2" unique, btree (f)

It contains +- 250k rows of totally random 10-char-long strings
(containing upper- & lowercase letters and numbers). Here's how the LIKE
performs:

db1=> select * from t where f like 'xx%';
f
------------
xxEqfLZMkH
xxBRRnLetJ
...
xxFPYJEiYf
(98 rows)

Time: 452.613 ms

Would using an index potentially help the performance of this query, and
if yes, how do I force Postgres to use the index?

db1=> select * from t where lower(f) like 'mmm%';
f
------------
MmmyEVmfSY
MMmzolhHtq
...
mMMWEQzlKm
(16 rows)

Time: 634.470 ms

--
dave
---------------------------(end of broadcast)---------------------------
TIP 4: Don't 'kill -9' the postmaster

Nov 22 '05 #1
10 2088
David Garamond said:
Would using an index potentially help the performance of this query, and
if yes, how do I force Postgres to use the index?

db1=> select * from t where lower(f) like 'mmm%';


I suspect the fact that you're specifying the lower function on the column
data, ie lower(f), implies that the function has to be applied to every
row in the table in order to calculate the value prior to testing the like
condition.

I don't know enough about what you can and cannot do index-wise in PG, in
terms of creating an index based on a computed (upper/lower) value of a
column.

But you could consider adding an extra column to the table and a trigger
so that the trigger places an UPPER or LOWER version of the column "f"
into the new column.

Like searches would then be

select * from t where new_upper_f like upper('MMM%');

Provided that there is an index on the new column, new_upper_f, you should
avoid the full table scan. (I think, I haven't tested this out)...

John Sidney-Woollett
---------------------------(end of broadcast)---------------------------
TIP 6: Have you searched our list archives?

http://archives.postgresql.org

Nov 22 '05 #2
try this:
CREATE [ UNIQUE ] INDEX my_index ON t ( lower(f));

John Sidney-Woollett wrote:
David Garamond said:

Would using an index potentially help the performance of this query, and
if yes, how do I force Postgres to use the index?

db1=> select * from t where lower(f) like 'mmm%';


I suspect the fact that you're specifying the lower function on the column
data, ie lower(f), implies that the function has to be applied to every
row in the table in order to calculate the value prior to testing the like
condition.

I don't know enough about what you can and cannot do index-wise in PG, in
terms of creating an index based on a computed (upper/lower) value of a
column.

But you could consider adding an extra column to the table and a trigger
so that the trigger places an UPPER or LOWER version of the column "f"
into the new column.

Like searches would then be

select * from t where new_upper_f like upper('MMM%');

Provided that there is an index on the new column, new_upper_f, you should
avoid the full table scan. (I think, I haven't tested this out)...

John Sidney-Woollett
---------------------------(end of broadcast)---------------------------
TIP 6: Have you searched our list archives?

http://archives.postgresql.org

---------------------------(end of broadcast)---------------------------
TIP 8: explain analyze is your friend

Nov 22 '05 #3
John Sidney-Woollett said:
select * from t where new_upper_f like upper('MMM%');


I think I meant

select * from t where new_upper_f like 'MMM%';

or

select * from t where new_upper_f like upper('mmm%');

John

---------------------------(end of broadcast)---------------------------
TIP 8: explain analyze is your friend

Nov 22 '05 #4
If you use an exact = does it use the index?
e.g. explain select ... where lower(f)='xxxxxxxx'

If so it could be your locale setting. On some versions of Postgresql like
is disabled on non-C locales. On some versions of Postgresql on some
platforms the default is a non-C locale. With version 7.4 you can
workaround that:
http://www.postgresql.org/docs/curre...s-opclass.html

Hope that helps,

At 03:30 PM 2/5/2004 +0700, David Garamond wrote:
Reading the archives and the FAQ, it seems to be implied that LIKE can use
index (and ILIKE can't; so to do case-insensitive search you need to
create a functional index on LOWER(field) and say: LOWER(field) LIKE 'foo%').

However, EXPLAIN always says seq scan for the test data I'm using. I've
done 'set enable_seqscan to off' and it still says seq scan. I was curious
as to how the index will help this query:

db1=> set enable_seqscan to off;
SET
Time: 5.732 ms
db1=> explain select * from t where f like 'xx%';
QUERY PLAN
-------------------------------------------------------------------
Seq Scan on t (cost=100000000.00..100002698.90 rows=89 width=14)
Filter: (f ~~ 'xx%'::text)
(2 rows)

db1=> explain select * from t where lower(f) like 'xx%';
QUERY PLAN
--------------------------------------------------------------------
Seq Scan on t (cost=100000000.00..100002893.68 rows=390 width=14)
Filter: (lower(f) ~~ 'xx%'::text)
(2 rows)


---------------------------(end of broadcast)---------------------------
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to ma*******@postgresql.org so that your
message can get through to the mailing list cleanly

Nov 22 '05 #5
Lincoln Yeoh wrote:
If you use an exact = does it use the index?
e.g. explain select ... where lower(f)='xxxxxxxx'
Yes it does.
If so it could be your locale setting. On some versions of Postgresql
like is disabled on non-C locales.
I'm using 7.4.1. These are the lines in postgresql.conf (it's basically
pristine from the one created by initdb).

# These settings are initialized by initdb -- they may be changed
lc_messages = 'en_US.iso885915' #locale for system error message strings
lc_monetary = 'en_US.iso885915' #locale for monetary formatting
lc_numeric = 'en_US.iso885915' #locale for number formatting
lc_time = 'en_US.iso885915' #locale for time formatting
On some versions of Postgresql on
some platforms the default is a non-C locale. With version 7.4 you can
workaround that:
http://www.postgresql.org/docs/curre...s-opclass.html
Yes, that was the cause of the problem. I've now recreated the index
using the varchar_pattern_ops:

db1=> create unique index i1 on t(i varchar_pattern_ops);
db1=> create unique index i2 on t(lower(i) varchar_pattern_ops);

and now EXPLAIN tells me the query uses Index scan:

db1=> explain select * from t where f like 'xx%';
QUERY PLAN
--------------------------------------------------------------------------------------
Index Scan using i1 on t (cost=0.00..6.01 rows=322 width=14)
Index Cond: ((f ~>=~ 'xx'::character varying) AND (f ~<~
'xy'::character varying))
Filter: (f ~~ 'xx%'::text)
(3 rows)

db1=> explain select * from t where lower(f) like 'xx%';
QUERY PLAN
----------------------------------------------------------------------------------------------------
Index Scan using i2 on t (cost=0.00..4049.64 rows=1421 width=14)
Index Cond: ((lower(f) ~>=~ 'xx'::character varying) AND (lower(f)
~<~ 'xy'::character varying))
Filter: (lower(f) ~~ 'xx%'::text)
(3 rows)
Hope that helps,


Yes it does, thanks. Apparently using the index does improve the speed:

db1=> select * from t where f like 'xx%';
f
------------
xxAGRrXrXr
xxAwScNpWh
...
xxyuFyyDtn
(98 rows)

Time: 9.679 ms

db1=> select * from t where lower(f) like 'xx%';
f
------------
xxaAvoarIZ
XXadJWnXcK
...
xXzynzWllI
(413 rows)

Time: 8.626 ms

--
dave
---------------------------(end of broadcast)---------------------------
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to ma*******@postgresql.org)

Nov 22 '05 #6
On Thursday 05 February 2004 10:25, David Garamond wrote:

Glad to see your problem is solved. Your locale/charset settings look a bit
odd though:
# These settings are initialized by initdb -- they may be changed
lc_messages = 'en_US.iso885915' #locale for system error message strings
lc_monetary = 'en_US.iso885915' #locale for monetary formatting
lc_numeric = 'en_US.iso885915' #locale for number formatting
lc_time = 'en_US.iso885915' #locale for time formatting


US English with 8859-15 (Latin 9?) charset? Can I as what OS this is? It's
just that it seems like an odd combination to me - or am I displaying my
ignorance here?

--
Richard Huxton
Archonet Ltd

---------------------------(end of broadcast)---------------------------
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faqs/FAQ.html

Nov 22 '05 #7
Richard Huxton wrote:
On Thursday 05 February 2004 10:25, David Garamond wrote:

Glad to see your problem is solved. Your locale/charset settings look a bit
odd though:
# These settings are initialized by initdb -- they may be changed
lc_messages = 'en_US.iso885915' #locale for system error message strings
lc_monetary = 'en_US.iso885915' #locale for monetary formatting
lc_numeric = 'en_US.iso885915' #locale for number formatting
lc_time = 'en_US.iso885915' #locale for time formatting


US English with 8859-15 (Latin 9?) charset? Can I as what OS this is? It's
just that it seems like an odd combination to me - or am I displaying my
ignorance here?


It's Redhat 7.3 running under VMWare Workstation 4.0.* The host OS is
Windows 2000 + SP4. All software are pretty much left to their defaults.

* Of course, we use "real" Linux for production; this is just my home
machine.

--
dave
---------------------------(end of broadcast)---------------------------
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to ma*******@postgresql.org so that your
message can get through to the mailing list cleanly

Nov 22 '05 #8
Richard Huxton wrote:
US English with 8859-15 (Latin 9?) charset? Can I as what OS this is?
It's just that it seems like an odd combination to me - or am I
displaying my ignorance here?


FYI, 8859-15 is just 8859-1 with the Euro sign in place of the generic
currency sign and some other minor changes. As you might imagine, in
Europe it is (becoming?) the default, and since most people in the rest
of the world probably won't ever notice the difference, you can expect
it to become the preferred choice of operating systems there as well.
---------------------------(end of broadcast)---------------------------
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to ma*******@postgresql.org so that your
message can get through to the mailing list cleanly

Nov 22 '05 #9
On Thursday 05 February 2004 20:00, Peter Eisentraut wrote:
Richard Huxton wrote:
US English with 8859-15 (Latin 9?) charset? Can I as what OS this is?
It's just that it seems like an odd combination to me - or am I
displaying my ignorance here?


FYI, 8859-15 is just 8859-1 with the Euro sign in place of the generic
currency sign and some other minor changes. As you might imagine, in
Europe it is (becoming?) the default, and since most people in the rest
of the world probably won't ever notice the difference, you can expect
it to become the preferred choice of operating systems there as well.


Yeah - Harald Fuchs said the same thing (but it seems to have drifted off
list). I was getting mixed up with 8859-14 (Latin 8) which is the Celtic one
- blame it on being a Welshman.

--
Richard Huxton
Archonet Ltd

---------------------------(end of broadcast)---------------------------
TIP 7: don't forget to increase your free space map settings

Nov 22 '05 #10
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Why don't you use

select * from t where new_upper_f ilike 'mmm%';

The ilike operator ignores case and you would get rid of the function call.

On Thursday 05 February 2004 01:20 am, John Sidney-Woollett wrote:
John Sidney-Woollett said:
select * from t where new_upper_f like upper('MMM%');


I think I meant

select * from t where new_upper_f like 'MMM%';

or

select * from t where new_upper_f like upper('mmm%');

John

---------------------------(end of broadcast)---------------------------
TIP 8: explain analyze is your friend


- --
UC

- --
Open Source Solutions 4U, LLC 2570 Fleetwood Drive
Phone: +1 650 872 2425 San Bruno, CA 94066
Cell: +1 650 302 2405 United States
Fax: +1 650 872 2417
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.3 (GNU/Linux)

iD8DBQFAI+6wjqGXBvRToM4RAgY8AKCmTr8JQpbDd+FMEZCG1I TVIYH+CQCeJ45C
IHZxeLxj60UL+TlhXfH6pdM=
=JOOY
-----END PGP SIGNATURE-----
---------------------------(end of broadcast)---------------------------
TIP 8: explain analyze is your friend

Nov 22 '05 #11

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

0
by: Paul | last post by:
I'm confused about something I did this week. I have an 8.1.7.4 database with tables partitioned by dates. I recently dropped my maxvalue tablespace, added a few more partitions for up-coming dates,...
2
by: Chris | last post by:
Any help would be appreciated. I am running a script that does the following in succession. 1-Drop existing database and create new database 2-Defines tables, stored procedures and functions...
1
by: Steve_CA | last post by:
Hi, The more I read, the more confused I'm getting ! (no wonder they say ignorance is bliss) I just got back from the bookstore and was flipping through some SQL Server Administration...
9
by: Igor | last post by:
Is there a way to force optimizer to use indexes without hints? (some server setting or index type...) I'll give an example to clarify : I have a table with fields Customer_Code char(10) not...
9
by: pheonix1t | last post by:
hello, I've been assigned to do performance tuning on an SQL2000 database (around 10GB in size, several instances). So far, I see a single RAID5 array, 4CPU (xeon 700MHZ), 4GB RAM. I see the...
4
by: mairhtin o'feannag | last post by:
Hello, I want to ask the question "how large is each of the indexes in a particular tablespace??" since I want to know which user-created indexes are taking up huge amounts of space. I used the...
14
by: Jeff | last post by:
This is the first time that I remember ever having too many indexes on a table, but it has happened. I have en employees table and store in many places, on tables, the id of the employee that...
17
by: boa | last post by:
I'm currently planning disk layouts and use for a new version of our database. The current version has all data and indexes in the default filegroup, placed on one big raid-5 array(6 drives) along...
10
by: lesperancer | last post by:
you start with a small application in access97, then you have more modules and more... and you reach the point where tables like 'item' and 'employee' reach the limit and you know there's more...
0
MMcCarthy
by: MMcCarthy | last post by:
The more data you include in your tables the more you will need to have indexes to search and sort that data. However, there is a balance between having enough indexes and too many. Too many...
0
by: DolphinDB | last post by:
The formulas of 101 quantitative trading alphas used by WorldQuant were presented in the paper 101 Formulaic Alphas. However, some formulas are complex, leading to challenges in calculation. Take...
0
by: DolphinDB | last post by:
Tired of spending countless mintues downsampling your data? Look no further! In this article, you’ll learn how to efficiently downsample 6.48 billion high-frequency records to 61 million...
0
by: ryjfgjl | last post by:
ExcelToDatabase: batch import excel into database automatically...
0
isladogs
by: isladogs | last post by:
The next Access Europe meeting will be on Wednesday 6 Mar 2024 starting at 18:00 UK time (6PM UTC) and finishing at about 19:15 (7.15PM). In this month's session, we are pleased to welcome back...
1
isladogs
by: isladogs | last post by:
The next Access Europe meeting will be on Wednesday 6 Mar 2024 starting at 18:00 UK time (6PM UTC) and finishing at about 19:15 (7.15PM). In this month's session, we are pleased to welcome back...
0
by: jfyes | last post by:
As a hardware engineer, after seeing that CEIWEI recently released a new tool for Modbus RTU Over TCP/UDP filtering and monitoring, I actively went to its official website to take a look. It turned...
1
by: PapaRatzi | last post by:
Hello, I am teaching myself MS Access forms design and Visual Basic. I've created a table to capture a list of Top 30 singles and forms to capture new entries. The final step is a form (unbound)...
0
by: af34tf | last post by:
Hi Guys, I have a domain whose name is BytesLimited.com, and I want to sell it. Does anyone know about platforms that allow me to list my domain in auction for free. Thank you
0
by: Faith0G | last post by:
I am starting a new it consulting business and it's been a while since I setup a new website. Is wordpress still the best web based software for hosting a 5 page website? The webpages will be...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.