473,396 Members | 1,847 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,396 software developers and data experts.

How estimated rows is running ?

Hi,

I have a database with one table with about 22 millions records.
I have a script every day wich insert about 200 000 records ...
Sometime my script takes 4 hours to insert the data, sometime 1 hour.

Then I have a question does for inserting data PostgreSQL use somewhere the
number of estimated rows ... ? For me no ... but ...

For example ...

mybases=# ANALYZE VERBOSE my_table;
INFO: analyzing "public.my_table"
INFO: "my_table": 434342 pages, 30000 rows sampled, 22585030 estimated total
rows
ANALYZE
Time: 173317.410 ms

I insert 200 000 data ... 4 hours ...

Then I get back to the same situation (other server same database in same
situation before the insert ... (same analyze as before)) ... I do the same
Analyze ... then I do a vacuum analyze like this after ... Why the estimated
total rows of the first index is not in correlation with the analyze done
before ? Why the total at the end of the vacuum is also different ? And why
after the vacuum analyze the insert takes 1 hour ??

mybases=# VACUUM VERBOSE ANALYZE my_table;
INFO: vacuuming "public.my_table"
INFO: index "ix_my_table_compte" now contains 21897465 row versions in 113659
pages
DETAIL: 270724 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 12.23s/20.53u sec elapsed 824.63 sec.
INFO: index "ix_my_table_idcontract" now contains 21897465 row versions in
65647 pages
DETAIL: 270724 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 4.32s/14.52u sec elapsed 337.45 sec.
INFO: index "ix_my_table_arrete_week" now contains 21897465 row versions in
87723 pages
DETAIL: 270724 index row versions were removed.
2780 index pages have been deleted, 1689 are currently reusable.
CPU 8.68s/11.40u sec elapsed 754.39 sec.
INFO: "my_table": removed 270724 row versions in 5209 pages
DETAIL: CPU 0.27s/0.74u sec elapsed 15.40 sec.
INFO: "my_table": found 270724 removable, 21897465 nonremovable row versions
in 434342 pages
DETAIL: 0 dead row versions cannot be removed yet.
There were 416707 unused item pointers.
0 pages are entirely empty.
CPU 44.14s/52.89u sec elapsed 2247.39 sec.
INFO: "my_table": truncated 434342 to 431553 pages
DETAIL: CPU 0.06s/0.03u sec elapsed 0.08 sec.
INFO: vacuuming "pg_toast.pg_toast_917442"
INFO: index "pg_toast_917442_index" now contains 0 row versions in 1 pages
DETAIL: 0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: "pg_toast_917442": found 0 removable, 0 nonremovable row versions in 0
pages
DETAIL: 0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
0 pages are entirely empty.
CPU 0.00s/0.00u sec elapsed 0.06 sec.
INFO: analyzing "public.my_table"
INFO: "my_table": 431553 pages, 30000 rows sampled, 22440007 estimated total
rows
VACUUM
Time: 2493147.338 ms

Thanks for you replies,
--
Hervé Piedvache

Elma Ingénierie Informatique
6 rue du Faubourg Saint-Honoré
F-75008 - Paris - France
Pho. 33-144949901
Fax. 33-144949902

---------------------------(end of broadcast)---------------------------
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to ma*******@postgresql.org)

Nov 23 '05 #1
2 1900
On Wed, 7 Apr 2004, [iso-8859-15] Hervé Piedvache wrote:
Hi,

I have a database with one table with about 22 millions records.
I have a script every day wich insert about 200 000 records ...
Sometime my script takes 4 hours to insert the data, sometime 1 hour.


Are you wrapping all your inserts into one or several large transactions?
This can make inserting data MUCH faster in postgresql.

Do you have fk/pk relationships? If so, are they of the same types?

I wouldn't worry about the vacuum / analyze numbers being a little off. A
small variance of a few percent is no biggie, it's when things are off by
factors that the planner gets confused and makes bad decisions.
---------------------------(end of broadcast)---------------------------
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to ma*******@postgresql.org so that your
message can get through to the mailing list cleanly

Nov 23 '05 #2
Scott,

Le mercredi 7 Avril 2004 22:13, scott.marlowe a écrit :
On Wed, 7 Apr 2004, [iso-8859-15] Hervé Piedvache wrote:

I have a database with one table with about 22 millions records.
I have a script every day wich insert about 200 000 records ...
Sometime my script takes 4 hours to insert the data, sometime 1 hour.
Are you wrapping all your inserts into one or several large transactions?
This can make inserting data MUCH faster in postgresql.


It's one transaction ... because it's an INSERT from SELECT ...
Do you have fk/pk relationships? If so, are they of the same types?
No no fk/pk ...
I wouldn't worry about the vacuum / analyze numbers being a little off. A
small variance of a few percent is no biggie, it's when things are off by
factors that the planner gets confused and makes bad decisions.


Yes but he gets confused of about 150 000 rows ... it's few in front of the 22
millions ... but it's done just after an Analyze ... so why the analyze do
not get the information ??

Regards,
--
Hervé Piedvache

Elma Ingénierie Informatique
6 rue du Faubourg Saint-Honoré
F-75008 - Paris - France
Pho. 33-144949901
Fax. 33-144949902

---------------------------(end of broadcast)---------------------------
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to ma*******@postgresql.org so that your
message can get through to the mailing list cleanly

Nov 23 '05 #3

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

7
by: robert | last post by:
running 8.1.7 server, 8.1.6 client. i *thought* inner join should not return nulls, but not only that, but i get way more rows than i'm expecting. assume: order table: order_number
2
by: T Chaudhary | last post by:
Hi, I have a question about estimated query execution plans that are generated in QA of MSSQL. If I point at an icon/physical operator in the estimated QEP, it shows me some statistics about...
0
by: Tom | last post by:
It doesn't look like Microsoft will be hosting the PDC next year. Does anyone have any recommendations and estimated cost for something similar? (We need this info for budget purposes)
3
by: Kumar | last post by:
Hi, I have a table (TAB1) withi has 4 child tables on COL1 with DELETE CASCADE. When I delete any row in TAB1, it will delete corresponding rows on all child tables. SQL : delete from TAB1...
1
by: Michal Januszczyk | last post by:
I build some web aplication that needs to return some binary data using Response.BinaryWrite(byte data) All is fine, but one thing: the browser does not display "estimated time left" and the size of...
5
by: Lars Netzel | last post by:
Hello! I have a number of files to copy. I have to total Amount of files and the Total FileSize of this operation (for example 3453 files, 4065.4 Mb)... How can I calculate how much time this...
2
by: Ina Schmitz | last post by:
Hi NG, does IBM Universal Database 8.2 make any difference between actual and estimated execution plans like in SQL Server ("set showplan_all on" for estimated execution plan and "set statistics...
4
by: cbrichards via SQLMonster.com | last post by:
I have a stored procedure that will execute with less than 1,000 reads one time (with a specified set of parameters), then with a different set of parameters the procedure executes with close to...
0
by: awyeah | last post by:
Hello everyone, I have some very difficult task to accomplish using ADODB, or DAO in visual basic 6 whichever. This is my masters project and I am stuck!! Say I have a recordset with 4 customers...
0
by: Charles Arthur | last post by:
How do i turn on java script on a villaon, callus and itel keypad mobile phone
0
by: ryjfgjl | last post by:
In our work, we often receive Excel tables with data in the same format. If we want to analyze these data, it can be difficult to analyze them because the data is spread across multiple Excel files...
0
BarryA
by: BarryA | last post by:
What are the essential steps and strategies outlined in the Data Structures and Algorithms (DSA) roadmap for aspiring data scientists? How can individuals effectively utilize this roadmap to progress...
1
by: Sonnysonu | last post by:
This is the data of csv file 1 2 3 1 2 3 1 2 3 1 2 3 2 3 2 3 3 the lengths should be different i have to store the data by column-wise with in the specific length. suppose the i have to...
0
by: Hystou | last post by:
There are some requirements for setting up RAID: 1. The motherboard and BIOS support RAID configuration. 2. The motherboard has 2 or more available SATA protocol SSD/HDD slots (including MSATA, M.2...
0
marktang
by: marktang | last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However,...
0
by: Hystou | last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can...
0
Oralloy
by: Oralloy | last post by:
Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers,...
0
jinu1996
by: jinu1996 | last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.