473,699 Members | 2,431 Online
Bytes | Software Development & Data Engineering Community
+ Post

Home Posts Topics Members FAQ

SELECT from a set of values really slow?

I've been benchmarking some very simple databases. By simple, I mean a table
like this:

CREATE TABLE bench (
id SERIAL,
data TEXT )
CREATE INDEX bench_data_inde x ON bench (data)

which is filled with 100k records, where the data values for each record are
distinct (the data for record N is "text_item_ N").

I generate a random data value known to be in the table and lookup its id,
with "SELECT id FROM bench WHERE data = 'whatever'".

I run this for three seconds. PostgreSQL does OK: 4300 selects in 2999
msec. MySQL does a little better, but not much: 5500 selects in 2999 msec.

In my application, this kind of lookup turns out to be the bottleneck on
performance (which is why I'm benchmarking). On MySQL, I can speed it up
quite a bit by looking up more than one thing at a time:

SELECT id FROM bench WHERE data IN ('X', 'Y')

for example, gets about 4050 selects in 3 seconds, so that's 8100 records
looked up, compared to 5300 when they were done one at a time. It continues
to improve selecting more at a time. To my surprise, when I tried this
trick with PostgreSQL, it did not speed things up. In fact, it *massively*
slowed down--it only is getting 13 selects in 3 seconds, searching for two
at a time.

What's going on here?

PostgreSQL 7.4.2 on SuSE 9.1 Linux. (The version that comes with SuSE).

--
--Tim Smith
Nov 23 '05 #1
3 1843
CoL
hi,

Tim Smith wrote:

What's going on here?


#>explain analyze SELECT id FROM bench WHERE data IN ('X', 'Y');
#>set enable_seqscan to off;
#>explain analyze SELECT id FROM bench WHERE data IN ('X', 'Y');

result?

C.
Nov 23 '05 #2
Tim Smith <re************ @mouse-potato.com> writes:
SELECT id FROM bench WHERE data IN ('X', 'Y') To my surprise, when I tried this trick with PostgreSQL, it did not
speed things up. In fact, it *massively* slowed down--it only is
getting 13 selects in 3 seconds, searching for two at a time. What's going on here?


Likely it's switching from index to sequential scan because of a poor
estimate of how many rows will be returned. Have you ever ANALYZEd
the test table? Without either ANALYZE stats or a unique index,
the planner will certainly not think that the column is unique.

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faqs/FAQ.html

Nov 23 '05 #3
test=> insert into bench (id,data) select id, 'text_item_'||i d::text from
dummy where id<=100000 order by id;
INSERT 0 100001
test=> CREATE INDEX bench_data_inde x ON bench (data);
CREATE INDEX
test=> explain select * from bench where data = 'test_item_1';
Index Scan using bench_data_inde x on bench (cost=0.00..147 8.85 rows=501
width=36)
Index Cond: (data = 'test_item_1':: text)
(2 lignes)

test=> explain select * from bench where data in ( 'test_item_1',
'test_item_2' );
Seq Scan on bench (cost=0.00..219 0.01 rows=998 width=36)
Filter: ((data = 'test_item_1':: text) OR (data = 'test_item_2':: text))
(2 lignes)

test=> vacuum analyze bench;
VACUUM
test=> explain select * from bench where data in ( 'test_item_1',
'test_item_2' );
Index Scan using bench_data_inde x, bench_data_inde x on bench
(cost=0.00..7.9 1 rows=2 width=22)
Index Cond: ((data = 'test_item_1':: text) OR (data =
'test_item_2':: text))
(2 lignes)

=> once you analyze, it works...
*************** *************** *************** **

With 1 item :

test=>explain analyze select * from bench where data in ( 'test_item_1' );
Total runtime: 0.127 ms

With 11 items :

test=>explain analyze select * from bench where data in ( 'test_item_1',
'test_item_2', 'test_item_55', 'test_item_64', 'test_item_1005 ',
'test_item_7854 1', 'test_item_9652 1', 'test_item_8574 ', 'test_item_8965 2',
'test_item_1452 7', 'test_item_4865 2' );
Total runtime: 0.352 ms

*************** *************** *************** **

With a Join... see on psql-performance

On Sun, 10 Oct 2004 16:00:10 -0400, Tom Lane <tg*@sss.pgh.pa .us> wrote:
Tim Smith <re************ @mouse-potato.com> writes:
SELECT id FROM bench WHERE data IN ('X', 'Y')

To my surprise, when I tried this trick with PostgreSQL, it did not
speed things up. In fact, it *massively* slowed down--it only is
getting 13 selects in 3 seconds, searching for two at a time.

What's going on here?


Likely it's switching from index to sequential scan because of a poor
estimate of how many rows will be returned. Have you ever ANALYZEd
the test table? Without either ANALYZE stats or a unique index,
the planner will certainly not think that the column is unique.

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faqs/FAQ.html


---------------------------(end of broadcast)---------------------------
TIP 1: subscribe and unsubscribe commands go to ma*******@postg resql.org

Nov 23 '05 #4

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

11
2270
by: Jeff Sandler | last post by:
I need a MySQL select statement as part of a PHP script. I want to find rows where a certain column either starts with or equals a user-supplied string. The string will be 1 or more characters in length. The sending page allows the user to choose which search criteria he wants, so this (and many other) pieces of the select statement may or may not be present. The statement always starts with SELECT * from logdata where driver =...
3
2056
by: Sven Reifegerste | last post by:
Hi, i have a table with INT columns id,key,b1,b2,c1,c2, having 1.500.000 rows. 'key' and 'id' are indexed (Kardinality 385381) and id (Kardinality 1541525). Performing a SELECT * FROM table WHERE key IN (10,11,12) OR key BETWEEN 20000 AND 28000 AND b1<4500000 AND b2>3954545 AND c1<4543554 AND c2>4400000
2
9243
by: Benoit Le Goff | last post by:
Hello. I test some query on sql server 2000 (sp2 on OS windows 2000) and i want to know why a simple query like this : select * from Table Where Column like '%value' is more slow on 2000 than on sql 7. And this case arrive only if the % character is in the begin. If you test this : select * from Table Where Column like 'v%alue'
12
6548
by: Kevin Lyons | last post by:
Hello, I am trying to get my select options (courses) passed correctly from the following URL: http://www.dslextreme.com/users/kevinlyons/selectBoxes.html I am having difficulty getting the courses to pass the correct option value and then be displayed at the following URL: http://www.dslextreme.com/users/kevinlyons/selectResults.html I am passing countries, products, and courses. The first two display
1
2235
by: David Lawson | last post by:
The line indicated below from my php script is very slow (about 10 seconds). I have this field indexed so I thought that it would be much faster. Could someone tell me what might be wrong? I'm also including the dump of the table definitions. This is a cd cataloging database. Right now the filenames table is empty and I'm trying to populate it, but at the rate it's going it would take days. I have about 700,000 records in the 'files'...
8
11104
by: Rich | last post by:
My table looks like this: char(150) HTTP_REF, char(250) HTTP_USER, char(150) REMOTE_ADDR, char(150) REMOTE_HOST, char(150) URL, smalldatetime TIME_STAMP There are no indexes on this table and there are only 293,658 records total.
4
2271
by: Ian Richardson | last post by:
Hi, The function I've put together below is a rough idea to extend a SELECT list, starting from: <body> <form name="bambam"> <select id="fred"> <option value="1">1</option> <option value="2">2</option>
22
12476
by: MP | last post by:
vb6,ado,mdb,win2k i pass the sql string to the .Execute method on the open connection to Table_Name(const) db table fwiw (the connection opened via class wrapper:) msConnString = "Data Source=" & msDbFilename moConn.Properties("Persist Security Info") = False moConn.ConnectionString = msConnString moConn.CursorLocation = adUseClient moConn.Mode = adModeReadWrite' or using default...same result
1
4325
by: Mikle | last post by:
I need to update values in a table (where they are null or zero) with values from another table for the corresponding PopulationMemberID. Now I now that you can't update where there is a join.. but the join is in the nested select.. so it should work right? CurrentDb.Execute ("UPDATE Samples SET samples.SampleMemberEmployees=(SELECT pop.PopMemberEmployees FROM population as pop WHERE .=. ) WHERE nz(sampleMemberEmployees,0)=0 AND
0
8687
marktang
by: marktang | last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However, people are often confused as to whether an ONU can Work As a Router. In this blog post, we’ll explore What is ONU, What Is Router, ONU & Router’s main usage, and What is the difference between ONU and Router. Let’s take a closer look ! Part I. Meaning of...
0
9174
Oralloy
by: Oralloy | last post by:
Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers, it seems that the internal comparison operator "<=>" tries to promote arguments from unsigned to signed. This is as boiled down as I can make it. Here is my compilation command: g++-12 -std=c++20 -Wnarrowing bit_field.cpp Here is the code in...
0
9034
jinu1996
by: jinu1996 | last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven tapestry of website design and digital marketing. It's not merely about having a website; it's about crafting an immersive digital experience that captivates audiences and drives business growth. The Art of Business Website Design Your website is...
1
8914
by: Hystou | last post by:
Overview: Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows Update option using the Control Panel or Settings app; it automatically checks for updates and installs any it finds, whether you like it or not. For most users, this new feature is actually very convenient. If you want to control the update process,...
0
8883
tracyyun
by: tracyyun | last post by:
Dear forum friends, With the development of smart home technology, a variety of wireless communication protocols have appeared on the market, such as Zigbee, Z-Wave, Wi-Fi, Bluetooth, etc. Each protocol has its own unique characteristics and advantages, but as a user who is planning to build a smart home system, I am a bit confused by the choice of these technologies. I'm particularly interested in Zigbee because I've heard it does some...
1
6534
isladogs
by: isladogs | last post by:
The next Access Europe User Group meeting will be on Wednesday 1 May 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome a new presenter, Adolph Dupré who will be discussing some powerful techniques for using class modules. He will explain when you may want to use classes instead of User Defined Types (UDT). For example, to manage the data in unbound forms. Adolph will...
0
4376
by: TSSRALBI | last post by:
Hello I'm a network technician in training and I need your help. I am currently learning how to create and manage the different types of VPNs and I have a question about LAN-to-LAN VPNs. The last exercise I practiced was to create a LAN-to-LAN VPN between two Pfsense firewalls, by using IPSEC protocols. I succeeded, with both firewalls in the same network. But I'm wondering if it's possible to do the same thing, with 2 Pfsense firewalls...
0
4629
by: adsilva | last post by:
A Windows Forms form does not have the event Unload, like VB6. What one acts like?
1
3057
by: 6302768590 | last post by:
Hai team i want code for transfer the data from one system to another through IP address by using C# our system has to for every 5mins then we have to update the data what the data is updated we have to send another system

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.