Buglist

Hi ...

I'm trying to convince my boss to use PostgreSQL (I need RI, transactions
and views), but he keeps comparing the project to MySQL. Until now, I
found the answers to his questions on the www.postgresql.org page, but
now I'm lost :-)

Where do I find a list of bugs, both found and solved, or will I need to
ask on the pgsql-bugs list to get the answer?

Also, has anyone tried to compare the new transaction model in MySQL 4.x
to PostgreSQL?

I'm looking forward to receiving even more constructive arguments :-)

/BL

Nov 11 '05
On Thu, Aug 21, 2003 at 08:38:14PM +0530, Shridhar Daithankar wrote:
If a database is clean, i.e. has no dead tuples, an autovacuum daemon with a 1 min
interval can achieve pretty much the same result, can't it?


But we're talking about the case of large, busy databases that have
already choked their disks. We have the same problem here in our
test machines. We start running load tests, and with vacuums nicely
scheduled and everything we start topping out on the performance
pretty quickly, because of I/O bottlenecks on the database. We know
the difference in I/O bandwidth between our test env. and the
production env., so we can put in a fudge factor for this; but that's
it.
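(For readers following along: "vacuums nicely scheduled" usually means something
along the lines of the sketch below -- periodic lazy vacuums aimed at the tables
that actually churn, rather than one database-wide pass. The table names are made up.)

    -- run from cron every N minutes against the hot tables only;
    -- a lazy VACUUM does not lock out readers or writers, but it
    -- still generates the read I/O being discussed here.
    VACUUM ANALYZE orders;
    VACUUM ANALYZE order_lines;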

A


Nov 11 '05 #51
Well, if they are locked waiting on vacuum, then vacuum should upgrade
its priority to that of the highest waiting process (priority inheritance).
This way, vacuum will be running at a priority level equivalent to whoever is
waiting on it.

Regards,
Ed

On Thu, 21 Aug 2003, Andrew Sullivan wrote:
On Wed, Aug 20, 2003 at 11:41:41PM +0200, Karsten Hilbert wrote:
You mean, like, "nice 19" or so ?


ISTR someone reporting problems with locking on the performance list
from doing exactly that. The problem is that the vacuum back end
might take a lock and then not get any processor time -- in which
case everybody else gets their processor slice but can't do anything,
because they have to wait until the niced vacuum process gets back in
line.

A


Nov 11 '05 #52
On Thu, Aug 21, 2003 at 12:05:28PM -0400, Edmund Dengler wrote:
Well, if they are locked waiting on vacuum, then vacuum should upgrade
its priority to that of the highest waiting process (priority inheritance).
This way, vacuum will be running at a priority level equivalent to whoever is
waiting on it.


Right, but all that intelligence is something that isn't in there
now. And anyway, the real issue is I/O, not processor.

A


Nov 11 '05 #53
What I am pointing out is that this is all the same issue, and that
solutions to the "we can't do priorities because of locking issues" problem have
existed for many years. I/O is the same as processors: it is a resource
that needs managing. So the intelligence can be made to exist, it just
needs to be built.

Now on to other questions: can vacuuming be done without locks? Can it be
done in parts (i.e., lock only a bit)? Can the I/O be better managed? Is
this a general model that would work well?

I have plenty of queries that I would love to run on an "as the system
allows" basis, or with a "keep a bit of spare cycles or I/O for the
important stuff" policy, but which I cannot specify that way. So a vote from me
for any mechanism that allows priorities to be specified. If this is a desired
feature, then comes the hard part of working out what is feasible, what can be
done in a reasonable amount of time, and of actually doing it.

Regards!
Ed

On Thu, 21 Aug 2003, Andrew Sullivan wrote:
On Thu, Aug 21, 2003 at 12:05:28PM -0400, Edmund Dengler wrote:
Well, if they are locked waiting on vacuum, then vacuum should upgrade
its priority to that of the highest waiting process (priority inheritance).
This way, vacuum will be running at a priority level equivalent to whoever is
waiting on it.


Right, but all that intelligence is something that isn't in there
now. And anyway, the real issue is I/O, not processor.

A


Nov 11 '05 #54
On Thu, 21 Aug 2003 21:10:34 +0530, "Shridhar Daithankar"
<sh*****************@persistent.co.in> wrote:
Point I am trying to make is to tune FSM and autovacuum frequency
such that you catch all the dead tuples in RAM


You might be able to catch the pages with dead tuples in RAM, but
currently there's no way to keep VACUUM from reading in all the clean
pages, which can be far more ...
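(The FSM tuning Shridhar mentions is the max_fsm_* settings in postgresql.conf;
the numbers below are examples only, not recommendations.)

    -- current values can be inspected from any session:
    SHOW max_fsm_relations;   -- relations whose free space is tracked
    SHOW max_fsm_pages;       -- total pages with free space tracked
    -- raising them (in postgresql.conf, e.g. max_fsm_pages = 200000)
    -- lets lazy VACUUM remember more reusable space between runs.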

Servus
Manfred


Nov 11 '05 #55
On Thu, 21 Aug 2003 17:56:02 -0400, Tom Lane <tg*@sss.pgh.pa.us>
wrote:
Conceivably it could be a win, though,
if you could do frequent "vacuum decent"s and only a full-scan vacuum
once in awhile (once a day maybe).


That's what I had in mind; similar to the current situation where you
can avoid expensive VACUUM FULL by doing lazy VACUUM frequently
enough.
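(To spell out the distinction for anyone new to it -- the table name is
hypothetical:)

    -- lazy vacuum: marks dead tuples reusable, runs alongside normal
    -- traffic, and is what you want to run frequently
    VACUUM mytable;
    -- full vacuum: compacts the table but takes an exclusive lock,
    -- which is exactly what frequent lazy vacuums let you avoid
    VACUUM FULL mytable;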

Servus
Manfred


Nov 11 '05 #56
Manfred Koizar wrote:
On Thu, 21 Aug 2003 21:10:34 +0530, "Shridhar Daithankar"
<sh*****************@persistent.co.in> wrote:
Point I am trying to make is to tune FSM and autovacuum frequency
such that you catch all the dead tuples in RAM


You might be able to catch the pages with dead tuples in RAM, but
currently there's no way to keep VACUUM from reading in all the clean
pages, which can be far more ...


Which leads us to a zero gravity vacuum, that does the lazy vacuum for
pages currently available in the buffer cache only. And another pg_stat
column telling the number of tuples vacuumed so that an autovac has a
chance to avoid IO consuming vacuum runs for relations where 99% of the
dead tuples have been caught in memory.
Jan


Nov 11 '05 #57
On 21 Aug 2003 at 18:46, Jan Wieck wrote:
Manfred Koizar wrote:
On Thu, 21 Aug 2003 21:10:34 +0530, "Shridhar Daithankar"
<sh*****************@persistent.co.in> wrote:
Point I am trying to make is to tune FSM and autovacuum frequency
such that you catch all the dead tuples in RAM


You might be able to catch the pages with dead tuples in RAM, but
currently there's no way to keep VACUUM from reading in all the clean
pages, which can be far more ...


Which leads us to a zero gravity vacuum, that does the lazy vacuum for
pages currently available in the buffer cache only. And another pg_stat
column telling the number of tuples vacuumed so that an autovac has a
chance to avoid IO consuming vacuum runs for relations where 99% of the
dead tuples have been caught in memory.


Since autovacuum issues vacuum analyze only, is it acceptable to say that this
is taken care of already?

Bye
Shridhar


Nov 11 '05 #58
On Fri, 22 Aug 2003 12:15:33 +0530, "Shridhar Daithankar"
<sh*****************@persistent.co.in> wrote:
Which leads us to a zero gravity vacuum, that does the lazy vacuum for
pages currently available in the buffer cache only. [...]


Since autovacuum issues vacuum analyze only, is it acceptable to say that this
is taken care of already?


Even a plain VACUUM (without FULL) scans the whole relation to find
the (possibly few) pages that need to be changed. We are trying to
find a way to avoid those needless reads of clean pages, because (a)
they are IOs competing with other disk operations and (b) they push
useful pages out of OS cache and (c) of PG shared buffers. The latter
might become a non-issue with LRU-k, 2Q or ARC. But (a) and (b)
remain.
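(A plain VACUUM VERBOSE makes those whole-relation scans visible; the table
name is hypothetical.)

    -- the VERBOSE output reports how many pages were examined versus
    -- how many tuples were actually removed, which shows how much of
    -- the I/O was spent on pages that were already clean.
    VACUUM VERBOSE mytable;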

Servus
Manfred


Nov 11 '05 #59
Jan Wieck <Ja******@Yahoo.com> writes:
Shridhar Daithankar wrote:
Umm.. What does the FSM do then? I was under the impression that the FSM stores page
pointers and that vacuum works on FSM information only. In that case, it wouldn't
have to waste time finding out which pages to clean.

It's the other way around! VACUUM scans the tables to find and reclaim
free space and remembers that free space in the FSM.


Right. One big question mark in my mind about these "partial vacuum"
proposals is whether they'd still allow adequate FSM information to be
maintained. If VACUUM isn't looking at most of the pages, there's no
very good way to acquire info about where there's free space.

regards, tom lane


Nov 11 '05 #60
Tom Lane wrote:
Jan Wieck <Ja******@Yahoo.com> writes:
Shridhar Daithankar wrote:
Umm.. What does the FSM do then? I was under the impression that the FSM stores page
pointers and that vacuum works on FSM information only. In that case, it wouldn't
have to waste time finding out which pages to clean.

It's the other way around! VACUUM scans the tables to find and reclaim
free space and remembers that free space in the FSM.


Right. One big question mark in my mind about these "partial vacuum"
proposals is whether they'd still allow adequate FSM information to be
maintained. If VACUUM isn't looking at most of the pages, there's no
very good way to acquire info about where there's free space.


That's why I think it needs one more pg_stat column to count the number
of vacuumed tuples. If one does

tuples_updated + tuples_deleted - tuples_vacuumed

he'll get approximately the number of tuples a regular vacuum might be
able to reclaim. If that number is really small, no need for autovacuum
to cause any big trouble by scanning the relation.
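(Written out as a query against a hypothetical extension of the stats views --
n_tup_vacuumed below is the proposed column and does not exist today -- the
rule would look roughly like this:)

    SELECT relname,
           n_tup_upd + n_tup_del - n_tup_vacuumed AS reclaimable_estimate
      FROM pg_stat_user_tables
     ORDER BY reclaimable_estimate DESC;
    -- autovacuum would only bother scanning relations whose estimate
    -- is large relative to the table size.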

Another way to give autovacuum some hints would be to return some number
as the command's tuple count from vacuum, like the number of tuples actually
vacuumed. That, together with the new number of reltuples in pg_class,
will tell autovacuum how frequently a relation really needs scanning.
Jan


Nov 11 '05 #61
On Fri, 2003-08-22 at 10:45, Tom Lane wrote:
Jan Wieck <Ja******@Yahoo.com> writes:
Right. One big question mark in my mind about these "partial vacuum"
proposals is whether they'd still allow adequate FSM information to be
maintained. If VACUUM isn't looking at most of the pages, there's no
very good way to acquire info about where there's free space.


Well, pg_autovacuum really needs to be looking at the FSM anyway. It
could look at the FSM and choose to do a normal vacuum when the
amount of FSM data becomes inadequate. Of course, I'm not sure how
you would differentiate a busy table with "inadequate" FSM data from an
inactive table that doesn't even register in the FSM. Perhaps you would
still need to consult the stats system.
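(For illustration, the stats system already exposes per-table write activity,
assuming row-level statistics collection is enabled; a busy table shows up
here even before it leaves any trace in the FSM:)

    SELECT relname, n_tup_ins, n_tup_upd, n_tup_del
      FROM pg_stat_user_tables
     ORDER BY n_tup_upd + n_tup_del DESC;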

Nov 11 '05 #62
Jan Wieck <Ja******@Yahoo.com> writes:
Okay, my proposal would be to have a VACUUM mode where it tells the
buffer manager to return a page only if it is already in memory, and
a "not cached" indication if it would have to read it from disk, so that
VACUUM simply skips the page in that case.


Since no such call is available at the OS level, this would only work
well with very large shared_buffers settings (ie, you try to rely on
PG shared buffers to the exclusion of kernel disk cache). AFAIK the
general consensus is that that's not a good way to run Postgres.

regards, tom lane


Nov 11 '05 #63
Initial beta release of plPHP http://www.postgresql.org/news/143.html

On Tue, 2003-08-19 at 10:46, David Siebert wrote:
I learned MySQL then went on to Postgres. I chose Postgres for my in-house
project just because of the row locking and transactions. Looking
back, I could have used MySQL. I have yet to use stored procedures or
many of the higher-level features of Postgres, but transactions make
things so much cleaner. I do not think MySQL is a bad system. It works
well for many people in many situations. I think that MySQL and SAP
getting together could be very exciting. When it comes to SQL databases
I would say we have a wealth of good choices. This "if I use PHP I have to
use MySQL" idea is a load of tripe. PHP can work just fine with Postgres. I
hate to even suggest this, but has anyone thought of adding PHP to the
languages you can use to write stored procedures in Postgres?


Roderick A. Anderson wrote:
On 19 Aug 2003, Bo Lorentsen wrote:


Also, has anyone tried to compare the new transaction model in MySQL 4.x
to PostgreSQL?


Bo, I've recently started having to deal with MySQL. (Web sites
wanting/using PHP _need/have-to-have_ MySQL. Their words, not mine.) And
going from "I dislike MySQL" to "I'm really hating MySQL" has been
getting easier and easier.
My dealings with MySQL are with the 3.xx version, but I semi-followed a
thread on this several months ago so I feel fully qualified to throw in
my views. :-) My take on others' research was that the MySQL transaction
model is a bubble-gum-and-baling-wire add-on, not an integral part of
MySQL. It _was_ tacked onto the top of the database, so if either it or
MySQL failed you were likely to lose data.


I'm looking forward to receiving even more constructive arguments :-)


How about "Friends don't let friends use MySQL"?

Hopefully others with stronger knowledge will provide this.
Rod







Nov 11 '05 #64
Jan Wieck wrote:
Manfred Koizar wrote:
On Thu, 21 Aug 2003 21:10:34 +0530, "Shridhar Daithankar"
<sh*****************@persistent.co.in> wrote:
Point I am trying to make is to tune FSM and autovacuum frequency
such that you catch all the dead tuples in RAM


You might be able to catch the pages with dead tuples in RAM, but
currently there's no way to keep VACUUM from reading in all the clean
pages, which can be far more ...


Which leads us to a zero gravity vacuum, that does the lazy vacuum for
pages currently available in the buffer cache only. And another pg_stat
column telling the number of tuples vacuumed so that an autovac has a
chance to avoid IO consuming vacuum runs for relations where 99% of the
dead tuples have been caught in memory.


What would be really interesting is to look for dead tuples when you
write/discard a buffer page and add them to the FSM --- that is probably
the latest time you still have access to the page and has the highest
probability of being recyclable.


Nov 11 '05 #65
Bruce Momjian wrote:
Jan Wieck wrote:
Manfred Koizar wrote:
> On Thu, 21 Aug 2003 21:10:34 +0530, "Shridhar Daithankar"
> <sh*****************@persistent.co.in> wrote:
>>Point I am trying to make is to tune FSM and autovacuum frequency
>>such that you catch all the dead tuples in RAM
>
> You might be able to catch the pages with dead tuples in RAM, but
> currently there's no way to keep VACUUM from reading in all the clean
> pages, which can be far more ...


Which leads us to a zero gravity vacuum, that does the lazy vacuum for
pages currently available in the buffer cache only. And another pg_stat
column telling the number of tuples vacuumed so that an autovac has a
chance to avoid IO consuming vacuum runs for relations where 99% of the
dead tuples have been caught in memory.


What would be really interesting is to look for dead tuples when you
write/discard a buffer page and add them to the FSM --- that is probably
the latest time you still have access to the page and has the highest
probability of being recyclable.


True, but it's again in the time-critical path of a foreground
application, because it's done by a backend that has to read another page
on behalf of a waiting client right now. Also, there is only a small
probability that all the pages required to do the index purge for the
reclaimed tuples are in memory too. Plus there is still no direct
connection between a heap tuple's ctid and the physical location of its
index tuples, so purging an index requires a full scan of it, which is
best done in bulk operations.
Jan


Nov 11 '05 #66
After a long battle with technology, kh***@kcilink.com (Vivek Khera), an earthling, wrote:
>> "TL" == Tom Lane <tg*@sss.pgh.pa.us> writes:


TL> Just nice'ing the VACUUM process is likely to be counterproductive
TL> because of locking issues (priority inversion). Though if anyone cares
TL> to try it on a heavily-loaded system, I'd be interested to hear the
TL> results...

tried it once. didn't make much difference except that vacuum took
longer than normal. i didn't see any deadlocks.

i actually figured out what my main problem was. vacuum every 6 hours
on my two busiest tables was taking longer than 6 hours when we were
very busy...


I "wedged" a database server once that way; it was busy, busy, busy
with a multiplicity of processes trying to simultaneously vacuum the
same table.

The "new generation" resolution to that is pg_autovacuum; if you're
running a pre-7.3 version, a good idea is basically to have a vacuum
script that checks a "lock file" and exits if it sees that another
process is already busy vacuuming.
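(Not the lock-file approach itself, but the same guard can be sketched from
inside the database, assuming the stats collector is configured to report
command strings: skip the run if another backend is already vacuuming.)

    SELECT count(*) AS vacuums_in_progress
      FROM pg_stat_activity
     WHERE current_query ILIKE 'vacuum%';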
Nov 11 '05 #67
