By using this site, you agree to our updated Privacy Policy and our Terms of Use. Manage your Cookies Settings.
432,650 Members | 1,778 Online
Bytes IT Community
+ Ask a Question
Need help? Post your question and get tips & solutions from a community of 432,650 IT Pros & Developers. It's quick & easy.

performance problem aftrer update from 7.1 to 7.4.2

P: n/a
Hello everyone,

i just upgraded my old postgres-database from version 7.1 to 7.4.2.
i dumped out my 7.1 database (with pg_dump from 7.1) as an sql-file with
copy-commands and to one file using insert-statements.

after initalizing and starting postgres 7.4 on a different port and
datadirectory, i tried to import the sql-dump with the copy statements.
this import fails, but importing the dump-file with inserts took a long
time but was successfully.
ok, at first i do a vacuum and analyze and fire up the database again.
now i changed my php-scripts to use the new database as datasource.

ok, things look good, but as i was testing some php-sciripts, i
recognized that the querys took about 2 or 5 times longer :(
a test a script which reads a lot of stuff from database, normaly with
pgsql 7.1 it tooks about 4 seconds to display the data, but with
postgres 7.4 it tooks about 25 seconds.
i start the 7.4 pgsql the same way as 7.1 with postmaster -D
/xxx/xxx/... -N512 -S -F -B2048 -i
i also tried to kill some indexes and recreate them, but this doesnt
matter and the query took the same long time to execute. i also tested
some simple select querys, the will take longer time.

if somebody could help me, and hints on how i can speed up my 'new'
databse would be very nice :)

thanks in advance

yours
volker
---------------------------(end of broadcast)---------------------------
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to ma*******@postgresql.org so that your
message can get through to the mailing list cleanly

Nov 23 '05 #1
Share this Question
Share on Google+
20 Replies


P: n/a
Hello,

You are probably missing a step some where... I know you got your data
imported but you might try
using the pg_dump from 7.4.2 to grab the 7.1 database and import from
there. It sounds to me like you
are missing an index or something.

Also it would help if we new your data structure, if you could post an
explain from 7.1 and from 7.4 and
possibly the queries that you are running your tests against.

Sincerely,

Joshua D. Drake
Development - multi.art.studio wrote:
Hello everyone,

i just upgraded my old postgres-database from version 7.1 to 7.4.2.
i dumped out my 7.1 database (with pg_dump from 7.1) as an sql-file
with copy-commands and to one file using insert-statements.

after initalizing and starting postgres 7.4 on a different port and
datadirectory, i tried to import the sql-dump with the copy statements.
this import fails, but importing the dump-file with inserts took a
long time but was successfully.
ok, at first i do a vacuum and analyze and fire up the database again.
now i changed my php-scripts to use the new database as datasource.

ok, things look good, but as i was testing some php-sciripts, i
recognized that the querys took about 2 or 5 times longer :(
a test a script which reads a lot of stuff from database, normaly with
pgsql 7.1 it tooks about 4 seconds to display the data, but with
postgres 7.4 it tooks about 25 seconds.
i start the 7.4 pgsql the same way as 7.1 with postmaster -D
/xxx/xxx/... -N512 -S -F -B2048 -i
i also tried to kill some indexes and recreate them, but this doesnt
matter and the query took the same long time to execute. i also tested
some simple select querys, the will take longer time.

if somebody could help me, and hints on how i can speed up my 'new'
databse would be very nice :)

thanks in advance

yours
volker
---------------------------(end of broadcast)---------------------------
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to ma*******@postgresql.org so that your
message can get through to the mailing list cleanly


--
Command Prompt, Inc., home of Mammoth PostgreSQL - S/ODBC and S/JDBC
Postgresql support, programming shared hosting and dedicated hosting.
+1-503-667-4564 - jd@commandprompt.com - http://www.commandprompt.com
PostgreSQL Replicator -- production quality replication for PostgreSQL
---------------------------(end of broadcast)---------------------------
TIP 8: explain analyze is your friend

Nov 23 '05 #2

P: n/a
hello again,

thanks for help :)

i re-dumped the database (called mcms09) from postgres 7.1 with the
pg_dump command from 7.4.2 successfully and restored it to 7.4.2
(database called mcms, postgres 7.4.2 running on a different port)

i compared some querys from 7.1 to 7.4.2 and put all the output of the
explain command to a website, because i think its too much for posting
it here.
i just see that most time is spent in sorting the tables before (?)
creating the filter.

7.1:
mcms09=> explain select * from newsletter where site_id='m000000-970' order by date desc,id desc limit 10;
NOTICE: QUERY PLAN:
Limit (cost=9.26..9.26 rows=7 width=84)
-> Sort (cost=9.26..9.26 rows=8 width=84)
-> Index Scan using site_id_newsletter_key on newsletter (cost=0.00..9.14 rows=8 width=84)
EXPLAIN
7.4.2:
mcms=# explain select * from newsletter where site_id='m000000-970' order by date desc,id desc limit 10;
QUERY PLAN
---------------------------------------------------------------------------------------------------------
Limit (cost=17.78..17.81 rows=10 width=610)
-> Sort (cost=17.78..17.81 rows=11 width=610)
Sort Key: date, id
-> Index Scan using site_id_newsletter_key on newsletter (cost=0.00..17.59 rows=11 width=610)
Index Cond: ((site_id)::text = 'm000000-970'::text)
(5 rows)

but this query is the first one, done only once, then another subset of querys follow 10 times , the others depend on this first one, in my opinion this cant be the big factor which slows down the dbs
i also added new indexes to both databases on news, newsletter and
newsletter_send table, speeding up the query by factor 10 :-) but 7.1 is
always faster ....

you can see all the stuff , query, table layout and explains on the
website http://www.erdtrabant.de/index.php?i=60500

thanks in advance
volker

Joshua D. Drake wrote:
Hello,

You are probably missing a step some where... I know you got your data
imported but you might try
using the pg_dump from 7.4.2 to grab the 7.1 database and import from
there. It sounds to me like you
are missing an index or something.

Also it would help if we new your data structure, if you could post an
explain from 7.1 and from 7.4 and
possibly the queries that you are running your tests against.

Sincerely,

Joshua D. Drake
Development - multi.art.studio wrote:
Hello everyone,

i just upgraded my old postgres-database from version 7.1 to 7.4.2.
i dumped out my 7.1 database (with pg_dump from 7.1) as an sql-file
with copy-commands and to one file using insert-statements.

after initalizing and starting postgres 7.4 on a different port and
datadirectory, i tried to import the sql-dump with the copy statements.
this import fails, but importing the dump-file with inserts took a
long time but was successfully.
ok, at first i do a vacuum and analyze and fire up the database again.
now i changed my php-scripts to use the new database as datasource.

ok, things look good, but as i was testing some php-sciripts, i
recognized that the querys took about 2 or 5 times longer :(
a test a script which reads a lot of stuff from database, normaly
with pgsql 7.1 it tooks about 4 seconds to display the data, but with
postgres 7.4 it tooks about 25 seconds.
i start the 7.4 pgsql the same way as 7.1 with postmaster -D
/xxx/xxx/... -N512 -S -F -B2048 -i
i also tried to kill some indexes and recreate them, but this doesnt
matter and the query took the same long time to execute. i also
tested some simple select querys, the will take longer time.

if somebody could help me, and hints on how i can speed up my 'new'
databse would be very nice :)

thanks in advance

yours
volker
---------------------------(end of broadcast)---------------------------
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to ma*******@postgresql.org so that your
message can get through to the mailing list cleanly


Nov 23 '05 #3

P: n/a
On Wednesday 14 April 2004 01:56, Development - multi.art.studio wrote:

i compared some querys from 7.1 to 7.4.2 and put all the output of the
explain command to a website, because i think its too much for posting
it here.
Something funny here - notice the width of newsletter (84) in the old version
against the width in the new one (610).
7.1:
mcms09=> explain select * from newsletter where site_id='m000000-970' order
by date desc,id desc limit 10; NOTICE: QUERY PLAN:
Limit (cost=9.26..9.26 rows=7 width=84) 7.4.2:
mcms=# explain select * from newsletter where site_id='m000000-970' order
by date desc,id desc limit 10; QUERY PLAN
---------------------------------------------------------------------------
------------------------------ Limit (cost=17.78..17.81 rows=10 width=610)
-> Sort (cost=17.78..17.81 rows=11 width=610)


--
Richard Huxton
Archonet Ltd

---------------------------(end of broadcast)---------------------------
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to ma*******@postgresql.org so that your
message can get through to the mailing list cleanly

Nov 23 '05 #4

P: n/a
Hello, and thanks for reading

Richard Huxton wrote:
Something funny here - notice the width of newsletter (84) in the old version
against the width in the new one (610).

hmmm, but why? the old mcms09 database is still growing and productive,
and 7.4 is not updated and only for testing at this time.
as i do the tests, both database contents where the same, but the old
was still used by other users.
i do some additional tests, and a simple query explain in 7.4 gives
much bigger values than in 7.1,
i tried vacuum, but nothing deleteted because everything grows....and
only sometimes one
or two entries are deleted....,
last but not least i also reindex the new database, but indexes where
all ok, and there are no problems with that (not looking on the final db
design)

what happens that 7.4 will took so much time, and width for its querys?
(really, i read that pg74 would speed up 2 or 3 times....if good db
design) i didnt change anything on the tables.

thanks for any help,
volker

ps:mcms09 will be availablke for download in a few weeks, using pgsql as
a backend :)
7.1:
mcms09=> explain select * from newsletter where site_id='m000000-970' order
by date desc,id desc limit 10; NOTICE: QUERY PLAN:
Limit (cost=9.26..9.26 rows=7 width=84)

7.4.2:
mcms=# explain select * from newsletter where site_id='m000000-970' order
by date desc,id desc limit 10; QUERY PLAN
---------------------------------------------------------------------------
------------------------------ Limit (cost=17.78..17.81 rows=10 width=610)
-> Sort (cost=17.78..17.81 rows=11 width=610)


Nov 23 '05 #5

P: n/a
On Wednesday 14 April 2004 11:08, Development - multi.art.studio wrote:
Richard Huxton wrote:
Something funny here - notice the width of newsletter (84) in the old
version against the width in the new one (610).
hmmm, but why? the old mcms09 database is still growing and productive,
and 7.4 is not updated and only for testing at this time.

what happens that 7.4 will took so much time, and width for its querys?
(really, i read that pg74 would speed up 2 or 3 times....if good db
design) i didnt change anything on the tables.


I'm not saying the change in width is your problem here, but it's odd. Compare
a pg_dump of the schema of 7.1 newsletter vs 7.4 newsletter. There are only
two possibilities:
1. PG has changed the way it reports row width (I don't remember any such
change).
2. Something *has* changed.

If it still doesn't make sense, can you post the CREATE TABLE/INDEX etc. for
the newsletter table and I'll try it here and see what figure I get.

Once we've cleared this up, we'll have a look at what your config settings are
and whether they need tuning. PS - EXPLAIN ANALYSE is better than just
EXPLAIN for seeing what happens, it actually runs the query and shows what
happened.
--
Richard Huxton
Archonet Ltd

---------------------------(end of broadcast)---------------------------
TIP 4: Don't 'kill -9' the postmaster

Nov 23 '05 #6

P: n/a
Richard Huxton <de*@archonet.com> writes:
1. PG has changed the way it reports row width (I don't remember any such
change).


My recollection is that up till 7.2, the estimation of widths for
variable-width columns was completely bogus. Since 7.2 it's driven by
an actual average width for the column as measured by ANALYZE. So if
the query is selecting some fairly wide variable-width columns then it's
entirely likely for the width estimate to take a big jump.

Given that we haven't seen any EXPLAIN ANALYZE output it's hard to say
anything about what the *real* problem is ...

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 8: explain analyze is your friend

Nov 23 '05 #7

P: n/a
On Thursday 15 April 2004 17:30, Development - multi.art.studio wrote:
Hello,

sorry im late, but here are more details:

im wondering why 7.4 doesnt use the newsletter_site_id_date_idx-index

explain with 7.1: without analyze
mcms09=> explain select * from newsletter where site_id='m000000-970' order
by date desc,id desc limit 10; NOTICE: QUERY PLAN:
Limit (cost=9.26..9.26 rows=7 width=84)
-> Sort (cost=9.26..9.26 rows=8 width=84)
-> Index Scan using newsletter_site_id_date_idx on newsletter
(cost=0.00..9.14 rows=8 width=84) EXPLAIN
__________________________
and with pgsql 7.4.2:

mcms=# explain analyse select * from newsletter where site_id='m000000-970'
order by date desc,id desc limit 10; Limit (cost=17.78..17.81 rows=10 width=610) (actual time=0.625..0.645
rows=10 loops=1)
-> Sort (cost=17.78..17.81 rows=11 width=610) (actual
time=0.620..0.627 rows=10 loops=1)
Sort Key: date, id
-> Index Scan using site_id_newsletter_key on newsletter
(cost=0.00..17.59 rows=11 width=610) (actual time=0.087..0.286 rows=15
loops=1)
Index Cond: ((site_id)::text = 'm000000-970'::text)

Total runtime: 0.766 ms


It's not using the date index because it's using the id index - there's only
10 matches, so that looks like a good choice to me. It takes less than 1ms,
so I'm not sure this is a good example of a problem.

--
Richard Huxton
Archonet Ltd

---------------------------(end of broadcast)---------------------------
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to ma*******@postgresql.org)

Nov 23 '05 #8

P: n/a
Hello,

sorry im late, but here are more details:

im wondering why 7.4 doesnt use the newsletter_site_id_date_idx-index

explain with 7.1: without analyze
mcms09=> explain select * from newsletter where site_id='m000000-970' order by date desc,id desc limit 10;
NOTICE: QUERY PLAN:
Limit (cost=9.26..9.26 rows=7 width=84)
-> Sort (cost=9.26..9.26 rows=8 width=84)
-> Index Scan using newsletter_site_id_date_idx on newsletter (cost=0.00..9.14 rows=8 width=84)
EXPLAIN
__________________________
and with pgsql 7.4.2:

mcms=# explain analyse select * from newsletter where site_id='m000000-970' order by date desc,id desc limit 10;

QUERY PLAN

----------------------------------------------------------------------------------------------------------------------------------------------------

Limit (cost=17.78..17.81 rows=10 width=610) (actual time=0.625..0.645 rows=10 loops=1)

-> Sort (cost=17.78..17.81 rows=11 width=610) (actual time=0.620..0.627 rows=10 loops=1)

Sort Key: date, id

-> Index Scan using site_id_newsletter_key on newsletter (cost=0.00..17.59 rows=11 width=610) (actual time=0.087..0.286 rows=15 loops=1)

Index Cond: ((site_id)::text = 'm000000-970'::text)

Total runtime: 0.766 ms

(6 rows)

_____________________________

i also dumped both db-structures, with pg_dump from 7.4.2
(i also used for dumping out old 7.1 before importing to 7.4.2,
i also tried pg_dump from 7.1 and restored the db to 7.4, but performace was the same)

from old 7.1 dumped with pg_dump from 7.4.2:

CREATE SEQUENCE newsletter_id_seq
INCREMENT BY 1
MAXVALUE 2147483647
NO MINVALUE
CACHE 1;

CREATE TABLE newsletter (
id integer DEFAULT nextval('"newsletter_id_seq"'::text) NOT NULL,
site_id character varying,
date character varying,
title character varying,
text text,
aktiv smallint DEFAULT 1,
online smallint DEFAULT 1,
subtitle character varying,
show_titles smallint,
show_headline smallint,
bgcolor character varying
);

CREATE UNIQUE INDEX newsletter_id_key ON newsletter USING btree (id int4_ops);
CREATE INDEX aktiv_newsletter_key ON newsletter USING btree (aktiv int2_ops);
CREATE INDEX site_id_newsletter_key ON newsletter USING btree (site_id varchar_ops);
CREATE INDEX date_newsletter_key ON newsletter USING btree (date varchar_ops);
CREATE INDEX online_newsletter_key ON newsletter USING btree (online int2_ops);
CREATE INDEX newsletter_site_id_date_idx ON newsletter USING btree (site_id varchar_ops, date varchar_ops);

mcms09=> \d newsletter
Table "newsletter"
Attribute | Type | Modifier
---------------+-------------------+-------------------------------------------------------
id | integer | not null default nextval('"newsletter_id_seq"'::text)
site_id | character varying |
date | character varying |
title | character varying |
text | text |
aktiv | smallint | default 1
online | smallint | default 1
subtitle | character varying |
show_titles | smallint |
show_headline | smallint |
bgcolor | character varying |
Indices: aktiv_newsletter_key,
date_newsletter_key,
newsletter_id_key,
newsletter_site_id_date_idx,
online_newsletter_key,
site_id_newsletter_key

____________________

and structure from pgsql 7.4.2:

CREATE SEQUENCE newsletter_id_seq

INCREMENT BY 1
MAXVALUE 2147483647
NO MINVALUE
CACHE 1;

CREATE TABLE newsletter (
id integer DEFAULT nextval('"newsletter_id_seq"'::text) NOT NULL,
site_id character varying,
date character varying,
title character varying,
text text,
aktiv smallint DEFAULT 1,
online smallint DEFAULT 1,
subtitle character varying,
show_titles smallint,
show_headline smallint,
bgcolor character varying
);

CREATE UNIQUE INDEX newsletter_id_key ON newsletter USING btree (id);
CREATE INDEX site_id_newsletter_key ON newsletter USING btree (site_id);
CREATE INDEX date_newsletter_key ON newsletter USING btree (date);
CREATE INDEX online_newsletter_key ON newsletter USING btree (online);
CREATE INDEX newsleter_date_idx ON newsletter USING btree (date);
CREATE INDEX newsletter_site_id_date_idx ON newsletter USING btree (site_id varchar_ops, date varchar_ops);

mcms=# \d newsletter
Table "public.newsletter"
Column | Type | Modifiers
---------------+-------------------+-------------------------------------------------------
id | integer | not null default nextval('"newsletter_id_seq"'::text)
site_id | character varying |
date | character varying |
title | character varying |
text | text |
aktiv | smallint | default 1
online | smallint | default 1
subtitle | character varying |
show_titles | smallint |
show_headline | smallint |
bgcolor | character varying |
Indexes:
"newsletter_id_key" unique, btree (id)
"aktiv_newsletter_key" btree (aktiv)
"date_newsletter_key" btree (date)
"newsleter_date_idx" btree (date)
"newsletter_site_id_date_idx" btree (site_id, date)
"online_newsletter_key" btree (online)
"site_id_newsletter_key" btree (site_id)
i tried also creation of index on date only in 7.4, but this does not change anything.
hmmm

special thanks for reading and all comments :)
yours sincerely,
volker

Tom Lane wrote:
Richard Huxton <de*@archonet.com> writes:

1. PG has changed the way it reports row width (I don't remember any such
change).


My recollection is that up till 7.2, the estimation of widths for
variable-width columns was completely bogus. Since 7.2 it's driven by
an actual average width for the column as measured by ANALYZE. So if
the query is selecting some fairly wide variable-width columns then it's
entirely likely for the width estimate to take a big jump.

Given that we haven't seen any EXPLAIN ANALYZE output it's hard to say
anything about what the *real* problem is ...

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 8: explain analyze is your friend


Nov 23 '05 #9

P: n/a
hello everyone,

Richard Huxton wrote:
It's not using the date index because it's using the id index - there's only
10 matches, so that looks like a good choice to me. It takes less than 1ms,
so I'm not sure this is a good example of a problem.
thanks all !
but this doesnt help me a lot, i dont know what what to do? im not an expert.
ok, i could use another site_id and it will grow up and gives 500 or more rows back from the select clause.
this takes a lot of more time.
would this help?

for me it looks as pg742 is not as performant as 7.1 ? and takes a lot of more time, and doesnt look for the correct index? but why this?
why does it take longer than in 7.1? any ideas? what can i do?

any help would be greatly apreciated.

yours volker
sory for the bad english

Richard Huxton wrote:
On Thursday 15 April 2004 17:30, Development - multi.art.studio wrote:

Hello,

sorry im late, but here are more details:

im wondering why 7.4 doesnt use the newsletter_site_id_date_idx-index

explain with 7.1: without analyze
mcms09=> explain select * from newsletter where site_id='m000000-970' order
by date desc,id desc limit 10; NOTICE: QUERY PLAN:
Limit (cost=9.26..9.26 rows=7 width=84)
-> Sort (cost=9.26..9.26 rows=8 width=84)
-> Index Scan using newsletter_site_id_date_idx on newsletter
(cost=0.00..9.14 rows=8 width=84) EXPLAIN
__________________________
and with pgsql 7.4.2:

mcms=# explain analyse select * from newsletter where site_id='m000000-970'
order by date desc,id desc limit 10;

Limit (cost=17.78..17.81 rows=10 width=610) (actual time=0.625..0.645
rows=10 loops=1)
-> Sort (cost=17.78..17.81 rows=11 width=610) (actual
time=0.620..0.627 rows=10 loops=1)
Sort Key: date, id
-> Index Scan using site_id_newsletter_key on newsletter
(cost=0.00..17.59 rows=11 width=610) (actual time=0.087..0.286 rows=15
loops=1)
Index Cond: ((site_id)::text = 'm000000-970'::text)

Total runtime: 0.766 ms


It's not using the date index because it's using the id index - there's only
10 matches, so that looks like a good choice to me. It takes less than 1ms,
so I'm not sure this is a good example of a problem.

Nov 23 '05 #10

P: n/a
On Saturday 17 April 2004 13:23, Development - multi.art.studio wrote:
hello everyone,

Richard Huxton wrote:
It's not using the date index because it's using the id index - there's
only 10 matches, so that looks like a good choice to me. It takes less
than 1ms, so I'm not sure this is a good example of a problem.
thanks all !
but this doesnt help me a lot, i dont know what what to do? im not an
expert. ok, i could use another site_id and it will grow up and gives 500
or more rows back from the select clause. this takes a lot of more time.
would this help?


If that is the problem, that's what you'll need to post.
for me it looks as pg742 is not as performant as 7.1 ? and takes a lot of
more time, and doesnt look for the correct index? but why this? why does it
take longer than in 7.1? any ideas? what can i do?


It's not something people are generally seeing. In most cases performance is
the same or slightly better. For some queries it can be a lot better.

What I suggest:
1. Compare the two postgresql.conf files and any other config settings and
make sure you know what differences there are and why.
2. Identify what queries seem to be the cause of the problem, and pick one you
think is a good example.
3. VACCUM FULL and ANALYZE both 7.1 and 7.4 databases.
4. Stop the 7.4 server and run EXPLAIN ANALYSE (from psql) for 7.1. If you
want to make sure the data is cached, run it three times and use the last
one.
While you're running the explain, execute "vmstat 1 > vmstat.trace-7.1.txt" in
another terminal.
5. Stop the 7.1 server, restart 7.4 and run EXPLAIN ANALYSE (from psql) for
7.4, same as before, verify that the same results are returned. Same as
before for the caching.
Trace using vmstat for this too.

Post the query SQL and both EXPLAIN ANALYSE outputs along with table
definitions and row counts for each table involved and we can see exactly
where the problem is. If you think this is too much info, you can post it on
the web instead, that's fine.
If you're using the vmstat info, you could post that too, otherwise just keep
it safe in case it's needed.

--
Richard Huxton
Archonet Ltd

---------------------------(end of broadcast)---------------------------
TIP 4: Don't 'kill -9' the postmaster

Nov 23 '05 #11

P: n/a
hello,
Richard Huxton wrote:
What I suggest:
1. Compare the two postgresql.conf files and any other config settings and
make sure you know what differences there are and why.
2. Identify what queries seem to be the cause of the problem, and pick one you
think is a good example.
3. VACCUM FULL and ANALYZE both 7.1 and 7.4 databases.
4. Stop the 7.4 server and run EXPLAIN ANALYSE (from psql) for 7.1. If you
want to make sure the data is cached, run it three times and use the last
one.
While you're running the explain, execute "vmstat 1 > vmstat.trace-7.1.txt" in
another terminal.
5. Stop the 7.1 server, restart 7.4 and run EXPLAIN ANALYSE (from psql) for
7.4, same as before, verify that the same results are returned. Same as
before for the caching.
Trace using vmstat for this too.

thanks for the great hints, last days i was trying and trying......
restoring databases using a long time.
but i was surprised on the results.
ok, im not finished because i can stop the production dbs only late in
the night after telling some people...but here are some 'heavy' results

i dumped out my database with pgdump from 7.4 as before , one dump with
structure only, one with inserts
after that, doing a 'vacuum full' on 7.4 and stopping new 7.4 database.

i created a test-query
(it will show the entries in table newsletter_send (historical datas)
with links to valid newsletter and addresses always using a specific
site-id ordered by send-date, address-id and internal-id

on old production 7.1:

Query:

mcms09=> select * from newsletter_send where site_id='m200384-000' and
newsletter_id in (select id from newsletter where aktiv=1 and
site_id='m200384-000') and newsletter_adr_id in (select id from
newsletter_address where site_id='m200384-000' and aktiv=1) and sent is
not null order by sent desc,newsletter_adr_id desc,id desc;
Cancel request sent
ERROR: Query was cancelled.

------takes too long time, about 3 minutes without displaying a result ;-)

Explain:

mcms09=> explain select * from newsletter_send where
site_id='m200384-000' and newsletter_id in (select id from newsletter
where aktiv=1 and site_id='m200384-000') and newsletter_adr_id in
(select id from newsletter_address where site_id='m200384-000' and
aktiv=1) and sent is not null order by sent desc,newsletter_adr_id
desc,id desc;
NOTICE: QUERY PLAN:

Sort (cost=4224132.30..4224132.30 rows=29063 width=54)
-> Index Scan using newsletter_sent_site_id_idx on newsletter_send
(cost=0.00..4221402.52 rows=29063 width=54)
SubPlan
-> Seq Scan on newsletter_address (cost=0.00..13.61 rows=133
width=4)
-> Seq Scan on newsletter (cost=0.00..82.95 rows=841 width=4)
EXPLAIN

-------i do this querys about three or five times to be sure its cached

now i started 7.4 again and test it with fresh 7.4.2 restored database:

Query-Explain:

mcms=# explain select * from newsletter_send where site_id='m200384-000'
and newsletter_id in (select id from newsletter where aktiv=1 and
site_id='m200384-000') and newsletter_adr_id in (select id from
newsletter_address where site_id='m200384-000' and aktiv=1) and sent is
not null order by sent desc,newsletter_adr_id desc,id desc;
QUERY PLAN
----------------------------------------------------------------------------------------------
Sort (cost=6205.83..6275.90 rows=28026 width=76)
Sort Key: newsletter_send.sent, newsletter_send.newsletter_adr_id,
newsletter_send.id
-> Hash IN Join (cost=100.00..3409.13 rows=28026 width=76)
Hash Cond: ("outer".newsletter_id = "inner".id)
-> Hash IN Join (cost=13.94..2887.87 rows=30991 width=76)
Hash Cond: ("outer".newsletter_adr_id = "inner".id)
-> Seq Scan on newsletter_send (cost=0.00..2128.29
rows=87145 width=76)
Filter: (((site_id)::text = 'm200384-000'::text)
AND (sent IS NOT NULL))
-> Hash (cost=13.61..13.61 rows=133 width=4)
-> Seq Scan on newsletter_address
(cost=0.00..13.61 rows=133 width=4)
Filter: (((site_id)::text =
'm200384-000'::text) AND (aktiv = 1))
-> Hash (cost=83.95..83.95 rows=841 width=4)
-> Seq Scan on newsletter (cost=0.00..83.95 rows=841
width=4)
Filter: ((aktiv = 1) AND ((site_id)::text =
'm200384-000'::text))
(14 rows)
.....Heavy! its about 1000-times faster i think, and now postgres speeds
up like a rocket.
i dont know why this happens......, but i did another query, because my
idea is there are problems in sorting the table in a very simple
query....hmmmm, if im not totally wrong ;-) tell me *g
________________________________

ok here are some more tests

Query 7.1:
mcms09=> explain select * from newsletter where aktiv=1 and
site_id='m200384_000' order by id desc,date desc;
NOTICE: QUERY PLAN:
Sort (cost=9.29..9.29 rows=8 width=84)
-> Index Scan using newsletter_site_id_date_idx on newsletter
(cost=0.00..9.17 rows=8 width=84)
EXPLAIN

Query 7.4.2:
mcms=# explain select * from newsletter where aktiv=1 and
site_id='m200384_000' order by id desc,date desc;
QUERY PLAN
---------------------------------------------------------------------------------------------------
Sort (cost=18.67..18.70 rows=11 width=598)
Sort Key: id, date
-> Index Scan using site_id_newsletter_key on newsletter
(cost=0.00..18.48 rows=11 width=598)
Index Cond: ((site_id)::text = 'm200384_000'::text)
Filter: (aktiv = 1)
(5 rows)

now it costs double time to query the table.
Explain show me it sorts the table up by id and date, after that it
scans and uses filter.......is this the order postgres works? maybe
sorting the table without filters before querying it costs a lot of
performance, only result should be sorted.
hmmm, i will start it with higher debuglevel and try to get out.
Post the query SQL and both EXPLAIN ANALYSE outputs along with table
definitions and row counts for each table involved and we can see exactly
where the problem is. If you think this is too much info, you can post it on
the web instead, that's fine.
If you're using the vmstat info, you could post that too, otherwise just keep
it safe in case it's needed.

thanks in advance, i updated http://www.erdtrabant.de/index.php?p=&l=de&i=60500

yours sincerely
volker

Richard Huxton wrote:
On Saturday 17 April 2004 13:23, Development - multi.art.studio wrote:

hello everyone,

Richard Huxton wrote:

It's not using the date index because it's using the id index - there's
only 10 matches, so that looks like a good choice to me. It takes less
than 1ms, so I'm not sure this is a good example of a problem.

thanks all !
but this doesnt help me a lot, i dont know what what to do? im not an
expert. ok, i could use another site_id and it will grow up and gives 500
or more rows back from the select clause. this takes a lot of more time.
would this help?


If that is the problem, that's what you'll need to post.
for me it looks as pg742 is not as performant as 7.1 ? and takes a lot of
more time, and doesnt look for the correct index? but why this? why does it
take longer than in 7.1? any ideas? what can i do?


It's not something people are generally seeing. In most cases performance is
the same or slightly better. For some queries it can be a lot better.

What I suggest:
1. Compare the two postgresql.conf files and any other config settings and
make sure you know what differences there are and why.
2. Identify what queries seem to be the cause of the problem, and pick one you
think is a good example.
3. VACCUM FULL and ANALYZE both 7.1 and 7.4 databases.
4. Stop the 7.4 server and run EXPLAIN ANALYSE (from psql) for 7.1. If you
want to make sure the data is cached, run it three times and use the last
one.
While you're running the explain, execute "vmstat 1 > vmstat.trace-7.1.txt" in
another terminal.
5. Stop the 7.1 server, restart 7.4 and run EXPLAIN ANALYSE (from psql) for
7.4, same as before, verify that the same results are returned. Same as
before for the caching.
Trace using vmstat for this too.

Post the query SQL and both EXPLAIN ANALYSE outputs along with table
definitions and row counts for each table involved and we can see exactly
where the problem is. If you think this is too much info, you can post it on
the web instead, that's fine.
If you're using the vmstat info, you could post that too, otherwise just keep
it safe in case it's needed.

Nov 23 '05 #12

P: n/a
hello,
Richard Huxton wrote:
What I suggest:
1. Compare the two postgresql.conf files and any other config settings and
make sure you know what differences there are and why.
2. Identify what queries seem to be the cause of the problem, and pick one you
think is a good example.
3. VACCUM FULL and ANALYZE both 7.1 and 7.4 databases.
4. Stop the 7.4 server and run EXPLAIN ANALYSE (from psql) for 7.1. If you
want to make sure the data is cached, run it three times and use the last
one.
While you're running the explain, execute "vmstat 1 > vmstat.trace-7.1.txt" in
another terminal.
5. Stop the 7.1 server, restart 7.4 and run EXPLAIN ANALYSE (from psql) for
7.4, same as before, verify that the same results are returned. Same as
before for the caching.
Trace using vmstat for this too.

thanks for the great hints, last days i was trying and trying......
restoring databases using a long time.
but i was surprised on the results.
ok, im not finished because i can stop the production dbs only late in
the night after telling some people...but here are some 'heavy' results

i dumped out my database with pgdump from 7.4 as before , one dump with
structure only, one with inserts
after that, doing a 'vacuum full' on 7.4 and stopping new 7.4 database.

i created a test-query
(it will show the entries in table newsletter_send (historical datas)
with links to valid newsletter and addresses always using a specific
site-id ordered by send-date, address-id and internal-id

on old production 7.1:

Query:

mcms09=> select * from newsletter_send where site_id='m200384-000' and
newsletter_id in (select id from newsletter where aktiv=1 and
site_id='m200384-000') and newsletter_adr_id in (select id from
newsletter_address where site_id='m200384-000' and aktiv=1) and sent is
not null order by sent desc,newsletter_adr_id desc,id desc;
Cancel request sent
ERROR: Query was cancelled.

------takes too long time, about 3 minutes without displaying a result ;-)

Explain:

mcms09=> explain select * from newsletter_send where
site_id='m200384-000' and newsletter_id in (select id from newsletter
where aktiv=1 and site_id='m200384-000') and newsletter_adr_id in
(select id from newsletter_address where site_id='m200384-000' and
aktiv=1) and sent is not null order by sent desc,newsletter_adr_id
desc,id desc;
NOTICE: QUERY PLAN:

Sort (cost=4224132.30..4224132.30 rows=29063 width=54)
-> Index Scan using newsletter_sent_site_id_idx on newsletter_send
(cost=0.00..4221402.52 rows=29063 width=54)
SubPlan
-> Seq Scan on newsletter_address (cost=0.00..13.61 rows=133
width=4)
-> Seq Scan on newsletter (cost=0.00..82.95 rows=841 width=4)
EXPLAIN

-------i do this querys about three or five times to be sure its cached

now i started 7.4 again and test it with fresh 7.4.2 restored database:

Query-Explain:

mcms=# explain select * from newsletter_send where site_id='m200384-000'
and newsletter_id in (select id from newsletter where aktiv=1 and
site_id='m200384-000') and newsletter_adr_id in (select id from
newsletter_address where site_id='m200384-000' and aktiv=1) and sent is
not null order by sent desc,newsletter_adr_id desc,id desc;
QUERY PLAN
----------------------------------------------------------------------------------------------
Sort (cost=6205.83..6275.90 rows=28026 width=76)
Sort Key: newsletter_send.sent, newsletter_send.newsletter_adr_id,
newsletter_send.id
-> Hash IN Join (cost=100.00..3409.13 rows=28026 width=76)
Hash Cond: ("outer".newsletter_id = "inner".id)
-> Hash IN Join (cost=13.94..2887.87 rows=30991 width=76)
Hash Cond: ("outer".newsletter_adr_id = "inner".id)
-> Seq Scan on newsletter_send (cost=0.00..2128.29
rows=87145 width=76)
Filter: (((site_id)::text = 'm200384-000'::text)
AND (sent IS NOT NULL))
-> Hash (cost=13.61..13.61 rows=133 width=4)
-> Seq Scan on newsletter_address
(cost=0.00..13.61 rows=133 width=4)
Filter: (((site_id)::text =
'm200384-000'::text) AND (aktiv = 1))
-> Hash (cost=83.95..83.95 rows=841 width=4)
-> Seq Scan on newsletter (cost=0.00..83.95 rows=841
width=4)
Filter: ((aktiv = 1) AND ((site_id)::text =
'm200384-000'::text))
(14 rows)
.....Heavy! its about 1000-times faster i think, and now postgres speeds
up like a rocket.
i dont know why this happens......, but i did another query, because my
idea is there are problems in sorting the table in a very simple
query....hmmmm, if im not totally wrong ;-) tell me *g
________________________________

ok here are some more tests

Query 7.1:
mcms09=> explain select * from newsletter where aktiv=1 and
site_id='m200384_000' order by id desc,date desc;
NOTICE: QUERY PLAN:
Sort (cost=9.29..9.29 rows=8 width=84)
-> Index Scan using newsletter_site_id_date_idx on newsletter
(cost=0.00..9.17 rows=8 width=84)
EXPLAIN

Query 7.4.2:
mcms=# explain select * from newsletter where aktiv=1 and
site_id='m200384_000' order by id desc,date desc;
QUERY PLAN
---------------------------------------------------------------------------------------------------
Sort (cost=18.67..18.70 rows=11 width=598)
Sort Key: id, date
-> Index Scan using site_id_newsletter_key on newsletter
(cost=0.00..18.48 rows=11 width=598)
Index Cond: ((site_id)::text = 'm200384_000'::text)
Filter: (aktiv = 1)
(5 rows)

now it costs double time to query the table.
Explain show me it sorts the table up by id and date, after that it
scans and uses filter.......is this the order postgres works? maybe
sorting the table without filters before querying it costs a lot of
performance, only result should be sorted.
hmmm, i will start it with higher debuglevel and try to get out.
Post the query SQL and both EXPLAIN ANALYSE outputs along with table
definitions and row counts for each table involved and we can see exactly
where the problem is. If you think this is too much info, you can post it on
the web instead, that's fine.
If you're using the vmstat info, you could post that too, otherwise just keep
it safe in case it's needed.

thanks in advance, i updated http://www.erdtrabant.de/index.php?p=&l=de&i=60500

yours sincerely
volker

Richard Huxton wrote:
On Saturday 17 April 2004 13:23, Development - multi.art.studio wrote:

hello everyone,

Richard Huxton wrote:

It's not using the date index because it's using the id index - there's
only 10 matches, so that looks like a good choice to me. It takes less
than 1ms, so I'm not sure this is a good example of a problem.

thanks all !
but this doesnt help me a lot, i dont know what what to do? im not an
expert. ok, i could use another site_id and it will grow up and gives 500
or more rows back from the select clause. this takes a lot of more time.
would this help?


If that is the problem, that's what you'll need to post.
for me it looks as pg742 is not as performant as 7.1 ? and takes a lot of
more time, and doesnt look for the correct index? but why this? why does it
take longer than in 7.1? any ideas? what can i do?


It's not something people are generally seeing. In most cases performance is
the same or slightly better. For some queries it can be a lot better.

What I suggest:
1. Compare the two postgresql.conf files and any other config settings and
make sure you know what differences there are and why.
2. Identify what queries seem to be the cause of the problem, and pick one you
think is a good example.
3. VACCUM FULL and ANALYZE both 7.1 and 7.4 databases.
4. Stop the 7.4 server and run EXPLAIN ANALYSE (from psql) for 7.1. If you
want to make sure the data is cached, run it three times and use the last
one.
While you're running the explain, execute "vmstat 1 > vmstat.trace-7.1.txt" in
another terminal.
5. Stop the 7.1 server, restart 7.4 and run EXPLAIN ANALYSE (from psql) for
7.4, same as before, verify that the same results are returned. Same as
before for the caching.
Trace using vmstat for this too.

Post the query SQL and both EXPLAIN ANALYSE outputs along with table
definitions and row counts for each table involved and we can see exactly
where the problem is. If you think this is too much info, you can post it on
the web instead, that's fine.
If you're using the vmstat info, you could post that too, otherwise just keep
it safe in case it's needed.

Nov 23 '05 #13

P: n/a
hello again,

i did some mistakes, site-id was wrong in my simple select querys.
(interesting that no matches would take double time on 7.4)

here are the 'right' results, showing to me postgres 7.4 is slightly
slower with simple querys (but 1000-times faster with more complex
querys ;-)

with production 7.1:

mcms09=> select count(id) from newsletter where aktiv=1 and
site_id='m200384-000';
count
-------
845
(1 row)

---some users where working on it and added two newsletter since last
dump....

mcms09=> explain select * from newsletter where aktiv=1 and
site_id='m200384-000' order by id desc,date desc;
NOTICE: QUERY PLAN:
Sort (cost=123.78..123.78 rows=841 width=84)
-> Seq Scan on newsletter (cost=0.00..82.95 rows=841 width=84)
EXPLAIN
and 7.4 test:

mcms=# select count(id) from newsletter where aktiv=1 and
site_id='m200384-000';
count
-------
843
(1 row)

mcms=# explain select * from newsletter where aktiv=1 and
site_id='m200384-000' order by id desc,date desc;
QUERY PLAN
---------------------------------------------------------------------------
Sort (cost=124.81..126.91 rows=841 width=598)
Sort Key: id, date
-> Seq Scan on newsletter (cost=0.00..83.95 rows=841 width=598)
Filter: ((aktiv = 1) AND ((site_id)::text = 'm200384-000'::text))
(4 rows)
so in 7.4 seq-scan is about '1' (ms or what?) slower than 7.1

really thats not a very big table, my tests will go on!
if any idea, please let me know.

thanks in advance.
volker

Development - multi.art.studio wrote:
hello,
Richard Huxton wrote:
What I suggest:
1. Compare the two postgresql.conf files and any other config settings and
make sure you know what differences there are and why.
2. Identify what queries seem to be the cause of the problem, and pick one you
think is a good example.
3. VACCUM FULL and ANALYZE both 7.1 and 7.4 databases.
4. Stop the 7.4 server and run EXPLAIN ANALYSE (from psql) for 7.1. If you
want to make sure the data is cached, run it three times and use the last
one.
While you're running the explain, execute "vmstat 1 > vmstat.trace-7.1.txt" in
another terminal.
5. Stop the 7.1 server, restart 7.4 and run EXPLAIN ANALYSE (from psql) for
7.4, same as before, verify that the same results are returned. Same as
before for the caching.
Trace using vmstat for this too.


thanks for the great hints, last days i was trying and trying......
restoring databases using a long time.
but i was surprised on the results.
ok, im not finished because i can stop the production dbs only late in
the night after telling some people...but here are some 'heavy' results

i dumped out my database with pgdump from 7.4 as before , one dump
with structure only, one with inserts
after that, doing a 'vacuum full' on 7.4 and stopping new 7.4 database.

i created a test-query
(it will show the entries in table newsletter_send (historical datas)
with links to valid newsletter and addresses always using a specific
site-id ordered by send-date, address-id and internal-id

on old production 7.1:

Query:

mcms09=> select * from newsletter_send where site_id='m200384-000' and
newsletter_id in (select id from newsletter where aktiv=1 and
site_id='m200384-000') and newsletter_adr_id in (select id from
newsletter_address where site_id='m200384-000' and aktiv=1) and sent
is not null order by sent desc,newsletter_adr_id desc,id desc;
Cancel request sent
ERROR: Query was cancelled.

------takes too long time, about 3 minutes without displaying a result ;-)

Explain:

mcms09=> explain select * from newsletter_send where
site_id='m200384-000' and newsletter_id in (select id from newsletter
where aktiv=1 and site_id='m200384-000') and newsletter_adr_id in
(select id from newsletter_address where site_id='m200384-000' and
aktiv=1) and sent is not null order by sent desc,newsletter_adr_id
desc,id desc;
NOTICE: QUERY PLAN:

Sort (cost=4224132.30..4224132.30 rows=29063 width=54)
-> Index Scan using newsletter_sent_site_id_idx on newsletter_send
(cost=0.00..4221402.52 rows=29063 width=54)
SubPlan
-> Seq Scan on newsletter_address (cost=0.00..13.61
rows=133 width=4)
-> Seq Scan on newsletter (cost=0.00..82.95 rows=841 width=4)
EXPLAIN

-------i do this querys about three or five times to be sure its cached

now i started 7.4 again and test it with fresh 7.4.2 restored database:

Query-Explain:

mcms=# explain select * from newsletter_send where
site_id='m200384-000' and newsletter_id in (select id from newsletter
where aktiv=1 and site_id='m200384-000') and newsletter_adr_id in
(select id from newsletter_address where site_id='m200384-000' and
aktiv=1) and sent is not null order by sent desc,newsletter_adr_id
desc,id desc;
QUERY PLAN
----------------------------------------------------------------------------------------------
Sort (cost=6205.83..6275.90 rows=28026 width=76)
Sort Key: newsletter_send.sent, newsletter_send.newsletter_adr_id,
newsletter_send.id
-> Hash IN Join (cost=100.00..3409.13 rows=28026 width=76)
Hash Cond: ("outer".newsletter_id = "inner".id)
-> Hash IN Join (cost=13.94..2887.87 rows=30991 width=76)
Hash Cond: ("outer".newsletter_adr_id = "inner".id)
-> Seq Scan on newsletter_send (cost=0.00..2128.29
rows=87145 width=76)
Filter: (((site_id)::text = 'm200384-000'::text)
AND (sent IS NOT NULL))
-> Hash (cost=13.61..13.61 rows=133 width=4)
-> Seq Scan on newsletter_address
(cost=0.00..13.61 rows=133 width=4)
Filter: (((site_id)::text =
'm200384-000'::text) AND (aktiv = 1))
-> Hash (cost=83.95..83.95 rows=841 width=4)
-> Seq Scan on newsletter (cost=0.00..83.95 rows=841
width=4)
Filter: ((aktiv = 1) AND ((site_id)::text =
'm200384-000'::text))
(14 rows)
....Heavy! its about 1000-times faster i think, and now postgres
speeds up like a rocket.
i dont know why this happens......, but i did another query, because
my idea is there are problems in sorting the table in a very simple
query....hmmmm, if im not totally wrong ;-) tell me *g
________________________________

ok here are some more tests

Query 7.1:
mcms09=> explain select * from newsletter where aktiv=1 and
site_id='m200384_000' order by id desc,date desc;
NOTICE: QUERY PLAN:
Sort (cost=9.29..9.29 rows=8 width=84)
-> Index Scan using newsletter_site_id_date_idx on newsletter
(cost=0.00..9.17 rows=8 width=84)
EXPLAIN

Query 7.4.2:
mcms=# explain select * from newsletter where aktiv=1 and
site_id='m200384_000' order by id desc,date desc;
QUERY PLAN
---------------------------------------------------------------------------------------------------
Sort (cost=18.67..18.70 rows=11 width=598)
Sort Key: id, date
-> Index Scan using site_id_newsletter_key on newsletter
(cost=0.00..18.48 rows=11 width=598)
Index Cond: ((site_id)::text = 'm200384_000'::text)
Filter: (aktiv = 1)
(5 rows)

now it costs double time to query the table.
Explain show me it sorts the table up by id and date, after that it
scans and uses filter.......is this the order postgres works? maybe
sorting the table without filters before querying it costs a lot of
performance, only result should be sorted.
hmmm, i will start it with higher debuglevel and try to get out.
Post the query SQL and both EXPLAIN ANALYSE outputs along with table
definitions and row counts for each table involved and we can see exactly
where the problem is. If you think this is too much info, you can post it on
the web instead, that's fine.
If you're using the vmstat info, you could post that too, otherwise just keep
it safe in case it's needed.



thanks in advance, i updated http://www.erdtrabant.de/index.php?p=&l=de&i=60500

yours sincerely
volker


Richard Huxton wrote:
On Saturday 17 April 2004 13:23, Development - multi.art.studio wrote:

hello everyone,

Richard Huxton wrote:
It's not using the date index because it's using the id index - there's
only 10 matches, so that looks like a good choice to me. It takes less
than 1ms, so I'm not sure this is a good example of a problem.
thanks all !
but this doesnt help me a lot, i dont know what what to do? im not an
expert. ok, i could use another site_id and it will grow up and gives 500
or more rows back from the select clause. this takes a lot of more time.
would this help?


If that is the problem, that's what you'll need to post.
for me it looks as pg742 is not as performant as 7.1 ? and takes a lot of
more time, and doesnt look for the correct index? but why this? why does it
take longer than in 7.1? any ideas? what can i do?


It's not something people are generally seeing. In most cases performance is
the same or slightly better. For some queries it can be a lot better.

What I suggest:
1. Compare the two postgresql.conf files and any other config settings and
make sure you know what differences there are and why.
2. Identify what queries seem to be the cause of the problem, and pick one you
think is a good example.
3. VACCUM FULL and ANALYZE both 7.1 and 7.4 databases.
4. Stop the 7.4 server and run EXPLAIN ANALYSE (from psql) for 7.1. If you
want to make sure the data is cached, run it three times and use the last
one.
While you're running the explain, execute "vmstat 1 > vmstat.trace-7.1.txt" in
another terminal.
5. Stop the 7.1 server, restart 7.4 and run EXPLAIN ANALYSE (from psql) for
7.4, same as before, verify that the same results are returned. Same as
before for the caching.
Trace using vmstat for this too.

Post the query SQL and both EXPLAIN ANALYSE outputs along with table
definitions and row counts for each table involved and we can see exactly
where the problem is. If you think this is too much info, you can post it on
the web instead, that's fine.
If you're using the vmstat info, you could post that too, otherwise just keep
it safe in case it's needed.

Nov 23 '05 #14

P: n/a
hello again,

i did some mistakes, site-id was wrong in my simple select querys.
(interesting that no matches would take double time on 7.4)

here are the 'right' results, showing to me postgres 7.4 is slightly
slower with simple querys (but 1000-times faster with more complex
querys ;-)

with production 7.1:

mcms09=> select count(id) from newsletter where aktiv=1 and
site_id='m200384-000';
count
-------
845
(1 row)

---some users where working on it and added two newsletter since last
dump....

mcms09=> explain select * from newsletter where aktiv=1 and
site_id='m200384-000' order by id desc,date desc;
NOTICE: QUERY PLAN:
Sort (cost=123.78..123.78 rows=841 width=84)
-> Seq Scan on newsletter (cost=0.00..82.95 rows=841 width=84)
EXPLAIN
and 7.4 test:

mcms=# select count(id) from newsletter where aktiv=1 and
site_id='m200384-000';
count
-------
843
(1 row)

mcms=# explain select * from newsletter where aktiv=1 and
site_id='m200384-000' order by id desc,date desc;
QUERY PLAN
---------------------------------------------------------------------------
Sort (cost=124.81..126.91 rows=841 width=598)
Sort Key: id, date
-> Seq Scan on newsletter (cost=0.00..83.95 rows=841 width=598)
Filter: ((aktiv = 1) AND ((site_id)::text = 'm200384-000'::text))
(4 rows)
so in 7.4 seq-scan is about '1' (ms or what?) slower than 7.1

really thats not a very big table, my tests will go on!
if any idea, please let me know.

thanks in advance.
volker

Development - multi.art.studio wrote:
hello,
Richard Huxton wrote:
What I suggest:
1. Compare the two postgresql.conf files and any other config settings and
make sure you know what differences there are and why.
2. Identify what queries seem to be the cause of the problem, and pick one you
think is a good example.
3. VACCUM FULL and ANALYZE both 7.1 and 7.4 databases.
4. Stop the 7.4 server and run EXPLAIN ANALYSE (from psql) for 7.1. If you
want to make sure the data is cached, run it three times and use the last
one.
While you're running the explain, execute "vmstat 1 > vmstat.trace-7.1.txt" in
another terminal.
5. Stop the 7.1 server, restart 7.4 and run EXPLAIN ANALYSE (from psql) for
7.4, same as before, verify that the same results are returned. Same as
before for the caching.
Trace using vmstat for this too.


thanks for the great hints, last days i was trying and trying......
restoring databases using a long time.
but i was surprised on the results.
ok, im not finished because i can stop the production dbs only late in
the night after telling some people...but here are some 'heavy' results

i dumped out my database with pgdump from 7.4 as before , one dump
with structure only, one with inserts
after that, doing a 'vacuum full' on 7.4 and stopping new 7.4 database.

i created a test-query
(it will show the entries in table newsletter_send (historical datas)
with links to valid newsletter and addresses always using a specific
site-id ordered by send-date, address-id and internal-id

on old production 7.1:

Query:

mcms09=> select * from newsletter_send where site_id='m200384-000' and
newsletter_id in (select id from newsletter where aktiv=1 and
site_id='m200384-000') and newsletter_adr_id in (select id from
newsletter_address where site_id='m200384-000' and aktiv=1) and sent
is not null order by sent desc,newsletter_adr_id desc,id desc;
Cancel request sent
ERROR: Query was cancelled.

------takes too long time, about 3 minutes without displaying a result ;-)

Explain:

mcms09=> explain select * from newsletter_send where
site_id='m200384-000' and newsletter_id in (select id from newsletter
where aktiv=1 and site_id='m200384-000') and newsletter_adr_id in
(select id from newsletter_address where site_id='m200384-000' and
aktiv=1) and sent is not null order by sent desc,newsletter_adr_id
desc,id desc;
NOTICE: QUERY PLAN:

Sort (cost=4224132.30..4224132.30 rows=29063 width=54)
-> Index Scan using newsletter_sent_site_id_idx on newsletter_send
(cost=0.00..4221402.52 rows=29063 width=54)
SubPlan
-> Seq Scan on newsletter_address (cost=0.00..13.61
rows=133 width=4)
-> Seq Scan on newsletter (cost=0.00..82.95 rows=841 width=4)
EXPLAIN

-------i do this querys about three or five times to be sure its cached

now i started 7.4 again and test it with fresh 7.4.2 restored database:

Query-Explain:

mcms=# explain select * from newsletter_send where
site_id='m200384-000' and newsletter_id in (select id from newsletter
where aktiv=1 and site_id='m200384-000') and newsletter_adr_id in
(select id from newsletter_address where site_id='m200384-000' and
aktiv=1) and sent is not null order by sent desc,newsletter_adr_id
desc,id desc;
QUERY PLAN
----------------------------------------------------------------------------------------------
Sort (cost=6205.83..6275.90 rows=28026 width=76)
Sort Key: newsletter_send.sent, newsletter_send.newsletter_adr_id,
newsletter_send.id
-> Hash IN Join (cost=100.00..3409.13 rows=28026 width=76)
Hash Cond: ("outer".newsletter_id = "inner".id)
-> Hash IN Join (cost=13.94..2887.87 rows=30991 width=76)
Hash Cond: ("outer".newsletter_adr_id = "inner".id)
-> Seq Scan on newsletter_send (cost=0.00..2128.29
rows=87145 width=76)
Filter: (((site_id)::text = 'm200384-000'::text)
AND (sent IS NOT NULL))
-> Hash (cost=13.61..13.61 rows=133 width=4)
-> Seq Scan on newsletter_address
(cost=0.00..13.61 rows=133 width=4)
Filter: (((site_id)::text =
'm200384-000'::text) AND (aktiv = 1))
-> Hash (cost=83.95..83.95 rows=841 width=4)
-> Seq Scan on newsletter (cost=0.00..83.95 rows=841
width=4)
Filter: ((aktiv = 1) AND ((site_id)::text =
'm200384-000'::text))
(14 rows)
....Heavy! its about 1000-times faster i think, and now postgres
speeds up like a rocket.
i dont know why this happens......, but i did another query, because
my idea is there are problems in sorting the table in a very simple
query....hmmmm, if im not totally wrong ;-) tell me *g
________________________________

ok here are some more tests

Query 7.1:
mcms09=> explain select * from newsletter where aktiv=1 and
site_id='m200384_000' order by id desc,date desc;
NOTICE: QUERY PLAN:
Sort (cost=9.29..9.29 rows=8 width=84)
-> Index Scan using newsletter_site_id_date_idx on newsletter
(cost=0.00..9.17 rows=8 width=84)
EXPLAIN

Query 7.4.2:
mcms=# explain select * from newsletter where aktiv=1 and
site_id='m200384_000' order by id desc,date desc;
QUERY PLAN
---------------------------------------------------------------------------------------------------
Sort (cost=18.67..18.70 rows=11 width=598)
Sort Key: id, date
-> Index Scan using site_id_newsletter_key on newsletter
(cost=0.00..18.48 rows=11 width=598)
Index Cond: ((site_id)::text = 'm200384_000'::text)
Filter: (aktiv = 1)
(5 rows)

now it costs double time to query the table.
Explain show me it sorts the table up by id and date, after that it
scans and uses filter.......is this the order postgres works? maybe
sorting the table without filters before querying it costs a lot of
performance, only result should be sorted.
hmmm, i will start it with higher debuglevel and try to get out.
Post the query SQL and both EXPLAIN ANALYSE outputs along with table
definitions and row counts for each table involved and we can see exactly
where the problem is. If you think this is too much info, you can post it on
the web instead, that's fine.
If you're using the vmstat info, you could post that too, otherwise just keep
it safe in case it's needed.



thanks in advance, i updated http://www.erdtrabant.de/index.php?p=&l=de&i=60500

yours sincerely
volker


Richard Huxton wrote:
On Saturday 17 April 2004 13:23, Development - multi.art.studio wrote:

hello everyone,

Richard Huxton wrote:
It's not using the date index because it's using the id index - there's
only 10 matches, so that looks like a good choice to me. It takes less
than 1ms, so I'm not sure this is a good example of a problem.
thanks all !
but this doesnt help me a lot, i dont know what what to do? im not an
expert. ok, i could use another site_id and it will grow up and gives 500
or more rows back from the select clause. this takes a lot of more time.
would this help?


If that is the problem, that's what you'll need to post.
for me it looks as pg742 is not as performant as 7.1 ? and takes a lot of
more time, and doesnt look for the correct index? but why this? why does it
take longer than in 7.1? any ideas? what can i do?


It's not something people are generally seeing. In most cases performance is
the same or slightly better. For some queries it can be a lot better.

What I suggest:
1. Compare the two postgresql.conf files and any other config settings and
make sure you know what differences there are and why.
2. Identify what queries seem to be the cause of the problem, and pick one you
think is a good example.
3. VACCUM FULL and ANALYZE both 7.1 and 7.4 databases.
4. Stop the 7.4 server and run EXPLAIN ANALYSE (from psql) for 7.1. If you
want to make sure the data is cached, run it three times and use the last
one.
While you're running the explain, execute "vmstat 1 > vmstat.trace-7.1.txt" in
another terminal.
5. Stop the 7.1 server, restart 7.4 and run EXPLAIN ANALYSE (from psql) for
7.4, same as before, verify that the same results are returned. Same as
before for the caching.
Trace using vmstat for this too.

Post the query SQL and both EXPLAIN ANALYSE outputs along with table
definitions and row counts for each table involved and we can see exactly
where the problem is. If you think this is too much info, you can post it on
the web instead, that's fine.
If you're using the vmstat info, you could post that too, otherwise just keep
it safe in case it's needed.

Nov 23 '05 #15

P: n/a

On Thu, 22 Apr 2004, Development - multi.art.studio wrote:
i did some mistakes, site-id was wrong in my simple select querys.
(interesting that no matches would take double time on 7.4)

here are the 'right' results, showing to me postgres 7.4 is slightly
slower with simple querys (but 1000-times faster with more complex
querys ;-)

with production 7.1:

mcms09=> select count(id) from newsletter where aktiv=1 and
site_id='m200384-000';
count
-------
845
(1 row)

---some users where working on it and added two newsletter since last
dump....

mcms09=> explain select * from newsletter where aktiv=1 and
site_id='m200384-000' order by id desc,date desc;
NOTICE: QUERY PLAN:
Sort (cost=123.78..123.78 rows=841 width=84)
-> Seq Scan on newsletter (cost=0.00..82.95 rows=841 width=84)
EXPLAIN
and 7.4 test:

mcms=# select count(id) from newsletter where aktiv=1 and
site_id='m200384-000';
count
-------
843
(1 row)

mcms=# explain select * from newsletter where aktiv=1 and
site_id='m200384-000' order by id desc,date desc;
QUERY PLAN
---------------------------------------------------------------------------
Sort (cost=124.81..126.91 rows=841 width=598)
Sort Key: id, date
-> Seq Scan on newsletter (cost=0.00..83.95 rows=841 width=598)
Filter: ((aktiv = 1) AND ((site_id)::text = 'm200384-000'::text))
(4 rows)
so in 7.4 seq-scan is about '1' (ms or what?) slower than 7.1


The cost numbers from explain have no direct connection to real time.
You'd actually have to compare the time it took to get the results from
the two in order to see how they ran (and be very careful about caching
effects and the like).

---------------------------(end of broadcast)---------------------------
TIP 1: subscribe and unsubscribe commands go to ma*******@postgresql.org

Nov 23 '05 #16

P: n/a

On Thu, 22 Apr 2004, Development - multi.art.studio wrote:
i did some mistakes, site-id was wrong in my simple select querys.
(interesting that no matches would take double time on 7.4)

here are the 'right' results, showing to me postgres 7.4 is slightly
slower with simple querys (but 1000-times faster with more complex
querys ;-)

with production 7.1:

mcms09=> select count(id) from newsletter where aktiv=1 and
site_id='m200384-000';
count
-------
845
(1 row)

---some users where working on it and added two newsletter since last
dump....

mcms09=> explain select * from newsletter where aktiv=1 and
site_id='m200384-000' order by id desc,date desc;
NOTICE: QUERY PLAN:
Sort (cost=123.78..123.78 rows=841 width=84)
-> Seq Scan on newsletter (cost=0.00..82.95 rows=841 width=84)
EXPLAIN
and 7.4 test:

mcms=# select count(id) from newsletter where aktiv=1 and
site_id='m200384-000';
count
-------
843
(1 row)

mcms=# explain select * from newsletter where aktiv=1 and
site_id='m200384-000' order by id desc,date desc;
QUERY PLAN
---------------------------------------------------------------------------
Sort (cost=124.81..126.91 rows=841 width=598)
Sort Key: id, date
-> Seq Scan on newsletter (cost=0.00..83.95 rows=841 width=598)
Filter: ((aktiv = 1) AND ((site_id)::text = 'm200384-000'::text))
(4 rows)
so in 7.4 seq-scan is about '1' (ms or what?) slower than 7.1


The cost numbers from explain have no direct connection to real time.
You'd actually have to compare the time it took to get the results from
the two in order to see how they ran (and be very careful about caching
effects and the like).

---------------------------(end of broadcast)---------------------------
TIP 1: subscribe and unsubscribe commands go to ma*******@postgresql.org

Nov 23 '05 #17

P: n/a
Stephan Szabo <ss****@megazone.bigpanda.com> writes:
On Thu, 22 Apr 2004, Development - multi.art.studio wrote:
so in 7.4 seq-scan is about '1' (ms or what?) slower than 7.1
The cost numbers from explain have no direct connection to real time.


.... or at least, if your concern is poor choice of query plan, it's
folly to treat them as being equivalent to real time. Actual runtimes
are the things to compare.

Also, I wouldn't care to bet that estimated costs from 7.1 are very
comparable to 7.4, anyway. We rejiggered the cost models enough in
between that it's quite likely to be a meaningless comparison.

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to ma*******@postgresql.org)

Nov 23 '05 #18

P: n/a
Stephan Szabo <ss****@megazone.bigpanda.com> writes:
On Thu, 22 Apr 2004, Development - multi.art.studio wrote:
so in 7.4 seq-scan is about '1' (ms or what?) slower than 7.1
The cost numbers from explain have no direct connection to real time.


.... or at least, if your concern is poor choice of query plan, it's
folly to treat them as being equivalent to real time. Actual runtimes
are the things to compare.

Also, I wouldn't care to bet that estimated costs from 7.1 are very
comparable to 7.4, anyway. We rejiggered the cost models enough in
between that it's quite likely to be a meaningless comparison.

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to ma*******@postgresql.org)

Nov 23 '05 #19

P: n/a
Hallo all,

thanks for the great help.
i tried around a bit more, using different queries and so on, but
pgsql7.4 will always be somewhat slower than 7.1
finally i tested some php code , (after fixing all those wrong
integer-values of '' to 0 or null ..... ) on it
and it uses a lot of more time for fulltext search than 7.1 with less
rows in the table, also dynamis site-management
will sometimes be more than 1 second slower.
i cant explain whats wrong, i dont matter, i will update my db and live
with that, maybe optimize the querys, but theres not a lot what i can do
here ;-)

if somebody is more interested in debugging this and getting out why
performance decreases, tell me and i will send you the stuff, a test
account and all you need.
i cant examine the problem, only can help

thanks for all, your sincerely
volker
Tom Lane wrote:
Stephan Szabo <ss****@megazone.bigpanda.com> writes:

On Thu, 22 Apr 2004, Development - multi.art.studio wrote:

so in 7.4 seq-scan is about '1' (ms or what?) slower than 7.1

The cost numbers from explain have no direct connection to real time.


... or at least, if your concern is poor choice of query plan, it's
folly to treat them as being equivalent to real time. Actual runtimes
are the things to compare.

Also, I wouldn't care to bet that estimated costs from 7.1 are very
comparable to 7.4, anyway. We rejiggered the cost models enough in
between that it's quite likely to be a meaningless comparison.

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to ma*******@postgresql.org)

Nov 23 '05 #20

P: n/a
Hallo all,

thanks for the great help.
i tried around a bit more, using different queries and so on, but
pgsql7.4 will always be somewhat slower than 7.1
finally i tested some php code , (after fixing all those wrong
integer-values of '' to 0 or null ..... ) on it
and it uses a lot of more time for fulltext search than 7.1 with less
rows in the table, also dynamis site-management
will sometimes be more than 1 second slower.
i cant explain whats wrong, i dont matter, i will update my db and live
with that, maybe optimize the querys, but theres not a lot what i can do
here ;-)

if somebody is more interested in debugging this and getting out why
performance decreases, tell me and i will send you the stuff, a test
account and all you need.
i cant examine the problem, only can help

thanks for all, your sincerely
volker
Tom Lane wrote:
Stephan Szabo <ss****@megazone.bigpanda.com> writes:

On Thu, 22 Apr 2004, Development - multi.art.studio wrote:

so in 7.4 seq-scan is about '1' (ms or what?) slower than 7.1

The cost numbers from explain have no direct connection to real time.


... or at least, if your concern is poor choice of query plan, it's
folly to treat them as being equivalent to real time. Actual runtimes
are the things to compare.

Also, I wouldn't care to bet that estimated costs from 7.1 are very
comparable to 7.4, anyway. We rejiggered the cost models enough in
between that it's quite likely to be a meaningless comparison.

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to ma*******@postgresql.org)

Nov 23 '05 #21

This discussion thread is closed

Replies have been disabled for this discussion.