"select count(*) from contacts" is too slow!

Why does 'select count(id) from "tblContacts"' do a sequential scan
when the field 'id' is indexed using a btree?

MySQL simply looks at the index, which keeps a handy record of the
number of rows.

Can anybody explain how and why postgres does this query like it does?

Many thanks

Paul

Nov 12 '05 #1
A long time ago, in a galaxy far, far away, pa********@clockltd.com (Paul Serby) wrote:
> Why does 'select count(id) from "tblContacts"' do a sequential scan
> when the field 'id' is indexed using a btree? MySQL simply looks at
> the index which is keeping a handy record of the number of rows.
> Can anybody explain how and why postgres does this query like it
> does?


Look into the semantics of MVCC (MultiVersion Concurrency Control);
that (otherwise useful) feature prevents having any such "handy
record."
--
let name="cbbrowne" and tld="cbbrowne.com" in String.concat "@" [name;tld];;
http://www.ntlug.org/~cbbrowne/spreadsheets.html
Developmental Psychology
"Schoolyard behavior resembles adult primate behavior because "Ontogeny
Recapitulates Phylogeny" doesn't stop at birth."
-- Mark Miller
Nov 12 '05 #2
On Tue, 7 Oct 2003, Paul Serby wrote:
> Why does 'select count(id) from "tblContacts"' do a sequential scan
> when the field 'id' is indexed using a btree?
>
> MySQL simply looks at the index which is keeping a handy record of the
> number of rows.
>
> Can anybody explain how and why postgres does this query like it does?


It's a FAQ I believe.

MySQL can tell you from its index because it doesn't care if it gives you the
right number or not.
--
Nigel Andrews

---------------------------(end of broadcast)---------------------------
TIP 9: the planner will ignore your desire to choose an index scan if your
joining column's datatypes do not match

Nov 12 '05 #3
On Tue, 7 Oct 2003, Paul Serby wrote:
> Why does 'select count(id) from "tblContacts"' do a sequential scan
> when the field 'id' is indexed using a btree?
>
> MySQL simply looks at the index which is keeping a handy record of the
> number of rows.
>
> Can anybody explain how and why postgres does this query like it does?


Because the index doesn't contain enough information to determine if a
particular row is visible to your transaction or not. It would have to go
read the table to find that out, at which point using the index doesn't
help. There's been a recent discussion of this on one of the lists
(either -general or -performance I'd guess) that you might want to look up
in the archives.
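
To make the point above concrete, here is a minimal Python sketch of why an index alone can't answer count(*). It is a toy model, not PostgreSQL's actual code: the names HeapTuple, xmin, xmax and visible() are illustrative assumptions, and the visibility rule is heavily simplified (real snapshots track more than a set of committed transaction ids). The index maps keys to row positions but carries no transaction information, so every entry still forces a look at the table row.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass
class HeapTuple:
    key: int
    xmin: int                   # id of the transaction that inserted the row
    xmax: Optional[int] = None  # id of the transaction that deleted it, if any


def visible(t: HeapTuple, committed: Set[int], my_xid: int) -> bool:
    """Toy rule: the insert must be committed (or mine), and any delete
    must be neither committed nor mine."""
    inserted = t.xmin in committed or t.xmin == my_xid
    deleted = t.xmax is not None and (t.xmax in committed or t.xmax == my_xid)
    return inserted and not deleted


# The table ("heap"): transactions 10 and 11 committed, 12 is still open.
heap = [
    HeapTuple(key=1, xmin=10),           # plain committed row
    HeapTuple(key=2, xmin=10, xmax=11),  # deleted by committed txn 11
    HeapTuple(key=3, xmin=12),           # inserted by open txn 12
    HeapTuple(key=4, xmin=10, xmax=12),  # deleted by open txn 12
    HeapTuple(key=5, xmin=12),           # inserted by open txn 12
]
committed = {10, 11}

# The index knows keys and row positions, nothing about xmin/xmax.
index = {t.key: pos for pos, t in enumerate(heap)}

index_entries = len(index)

# Counting correctly means visiting the heap row behind every index entry.
count_for_reader = sum(
    visible(heap[pos], committed, my_xid=13) for pos in index.values()
)
count_for_txn_12 = sum(
    visible(heap[pos], committed, my_xid=12) for pos in index.values()
)

print(index_entries, count_for_reader, count_for_txn_12)
```

The index holds five entries, yet a fresh reader and the open transaction 12 each get a different, smaller count, and neither number can be derived without reading the rows themselves.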


Nov 12 '05 #4
> MySQL can tell you from its index because it doesn't care if it gives you the
> right number or not.


Under what circumstances would MySQL give the wrong number?
--
Mike Nolan


Nov 12 '05 #5
A long time ago, in a galaxy far, far away, no***@celery.tssi.com wrote:
> > MySQL can tell you from its index because it doesn't care if it gives you the
> > right number or not.
>
> Under what circumstances would MySQL give the wrong number?


It would give the wrong number under _every_ circumstance where there
are uncommitted INSERTs or DELETEs.
--
select 'cbbrowne' || '@' || 'ntlug.org';
http://www3.sympatico.ca/cbbrowne/sap.html
Appendium to the Rules of the Evil Overlord #1: "I will not build
excessively integrated security-and-HVAC systems. They may be Really
Cool, but are far too vulnerable to breakdowns."
Nov 12 '05 #6
Christopher Browne wrote:
> A long time ago, in a galaxy far, far away, no***@celery.tssi.com wrote:
> > > MySQL can tell you from its index because it doesn't care if it gives you the
> > > right number or not.
> >
> > Under what circumstances would MySQL give the wrong number?
>
> It would give the wrong number under _every_ circumstance where there
> are uncommitted INSERTs or DELETEs.


Give them some credit. I just double checked:

Using MySQL 4.0.14 with InnoDB and transactions,

select count(*) from foo;

does not count uncommitted INSERTs.

Heck, even using MyISAM, MySQL's count(*) is still accurate, since all
INSERTs, etc. are autocommitted.

--
Linux homer 2.4.18-14 #1 Wed Sep 4 13:35:50 EDT 2002 i686 i686 i386
GNU/Linux
12:00pm up 287 days, 3:33, 7 users, load average: 6.93, 6.31, 6.16


Nov 12 '05 #7

Ang Chin Han <an***@bytecraft.com.my> writes:
> Heck, even using MyISAM, MySQL's count(*) is still accurate, since all INSERTs,
> etc. are autocommitted.


That's sort of true, but not the whole story. Even autocommitted transactions
can be pending for a significant amount of time. The reason it's accurate is
that with MySQL's MyISAM tables all updates take a table-level lock. So there's
never a chance to select the count while an uncommitted transaction is
pending, even if the update takes a long time.
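
A rough Python sketch of that locking behaviour, under stated assumptions: LockedTable and its methods are made up for illustration and are not how MyISAM is implemented. The writer holds one table-wide lock for the whole update, so a concurrent count() simply waits and can never observe the half-finished state.

```python
import threading
import time


class LockedTable:
    """Hypothetical table with one lock and a stored row counter."""

    def __init__(self):
        self._lock = threading.Lock()
        self._rows = []
        self._row_count = 0

    def insert(self, row, started=None, work_seconds=0.0):
        with self._lock:                 # table-level lock for the whole update
            if started is not None:
                started.set()            # tell the test the lock is now held
            self._rows.append(row)
            time.sleep(work_seconds)     # slow update: row stored, counter stale
            self._row_count += 1

    def count(self):
        with self._lock:                 # readers wait for any pending update
            return self._row_count


table = LockedTable()
writer_has_lock = threading.Event()
writer = threading.Thread(
    target=table.insert, args=("alice", writer_has_lock, 0.05)
)
writer.start()

writer_has_lock.wait()      # the update is now in progress
count_seen = table.count()  # blocks until the writer releases the lock
writer.join()

print(count_seen)
```

The reader always gets a consistent number, but only by waiting out the writer, which is the contention cost the next paragraph describes.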

This is simple and efficient when you have low levels of concurrency. But when
you have 4+ processors or transactions involving lots of disk i/o it kills
scalability.

I'm curious how it's implemented with InnoDB tables. Do they still take a
table-level lock when committing to update the counters? What happens to
transactions that have already started? Do they see the new value?

Actually it occurs to me that that might be ok for read-committed. Is there
ever a situation where a count(*) needs to represent an old snapshot in
read-committed? It has to for long-running selects, but if the count(*) itself
is always fast, that need should never arise: just take a shared lock, read the
value, and unlock.

In other words, imagine if you had every transaction keep a net delta of rows
for every table and at commit time locked the entire table and updated the
count. The lock would be a point of contention but it would be very fast since
it would only have to update an integer with a precalculated adjustment. In
read-committed mode that would always be a valid value. (The transaction would
have to apply its own deltas I guess.)
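
The delta idea above could be sketched in Python roughly as follows. This is only an illustration of the proposal, not anything PostgreSQL or MySQL is known to do: CountedStore and Txn are hypothetical names, only the counts are tracked (no actual rows), and only read-committed behaviour is modelled.

```python
import threading
from collections import defaultdict


class CountedStore:
    """Hypothetical store keeping one committed row count per table."""

    def __init__(self):
        self._lock = threading.Lock()
        self._committed = defaultdict(int)

    def begin(self):
        return Txn(self)

    def committed_count(self, table):
        with self._lock:              # brief lock: read one integer
            return self._committed[table]


class Txn:
    def __init__(self, store):
        self._store = store
        self._delta = defaultdict(int)   # net rows added per table, private

    def insert(self, table, n=1):
        self._delta[table] += n

    def delete(self, table, n=1):
        self._delta[table] -= n

    def count(self, table):
        # Committed value plus this transaction's own pending changes.
        return self._store.committed_count(table) + self._delta[table]

    def commit(self):
        with self._store._lock:       # brief lock: apply precomputed deltas
            for table, d in self._delta.items():
                self._store._committed[table] += d
        self._delta.clear()

    def rollback(self):
        self._delta.clear()           # nothing was ever published


store = CountedStore()
t1, t2 = store.begin(), store.begin()

t1.insert("contacts", 3)
t1_before = t1.count("contacts")  # t1 sees its own uncommitted rows
t2_before = t2.count("contacts")  # t2 does not

t1.commit()
t2_after = t2.count("contacts")   # read-committed: new value is visible

t2.delete("contacts")
t2.rollback()
final_count = store.committed_count("contacts")

print(t1_before, t2_before, t2_after, final_count)
```

The lock is held only long enough to add an integer, which is why it would be fast, though it is still a single point of contention for every commit that touches the table.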

--
greg

Nov 12 '05 #8
