DB cache size strategies

Ed L.

Here's some of my current notions on pgsql performance tuning strictly as it
relates to pgsql tuning parameters in the context of a dedicated linux or
hpux server. I'm particularly focusing on the shared_buffers setting. I
invite any corrective or confirming feedback. I realize there are many
other hugely important performance factors outside this scope.

One key aspect of pgsql performance tuning is to adjust the memory
consumption settings (shared_buffers, sort_mem, etc) large enough to hold
as much of the database in shared memory as possible while not causing page
swap-ins. I understand that both page swap-outs and swap space usage is
normal and OK, but lots of page swap-ins are bad). In other words, for
absolute fastest performance, we want a database buffer cache hit rate of
as close to 100% as possible.

Now, how do we accurately measure this cache hit rate at any one point?
Well, here's what I currently know as the best measure for a given cluster
of databases:

SELECT SUM(pg_stat_get_db_blocks_hit(d.oid))
/ SUM(pg_stat_get_db_blocks_fetched(d.oid)) AS hit_rate
FROM pg_database d;

How do we identify when we have sub-optimal pgsql configuration parameters
in this regard? My understanding: to whatever extent the hit rate is
below 100%, our shared buffers are not large enough to hold the data we're
accessing.

How do we fix the problem? We want to increase the shared_buffers setting
and/or maybe some combination sort_mem, wal_buffers, vacuum_mem, reset our
stats, and resume monitoring. I assume which of these we increase for
maximum benefit depends on the sources of the cache overflow (routine
queries, sorting, vacuuming), but that shared_buffers is the most impactful
in practice. Again, we want to do this without causing page swap-ins.
When you see swap-ins, you've gone too far. If we experience swap-ins and
less than 100% cache hit rate by any significant amount, then we need more
RAM or less RAM consumption.

The end result is that page swap-ins (sar -W) and cache hit rate (query
above) are two very key measures to use as warning bells.

Glad to hear your thoughts.

Ed
---------------------------(end of broadcast)---------------------------
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to ma*******@postgresql.org)

Nov 22 '05 #1

Subscribe Post Reply

6197

Ed L.

On Friday January 30 2004 2:33, Ed L. wrote:

One key aspect of pgsql performance tuning is to adjust the memory
consumption settings (shared_buffers, sort_mem, etc) large enough to hold
as much of the database in shared memory as possible while not causing
page swap-ins. I understand that both page swap-outs and swap space
usage is normal and OK, but lots of page swap-ins are bad). In other
words, for absolute fastest performance, we want a database buffer cache
hit rate of as close to 100% as possible.

I'm also curious about the relationship of DB shared buffer cache to the
linux/hpux kernel caches. In particular, if the block being retrieved in
pgsql was in the kernel's cache but not in the DB cache, thereby forcing a
read() system call, what kind of quantitative difference in performance
would one expect when comparing with block retrievals coming from the
cache? I would think they'd differ only by something on the order of
microseconds. Is the linux kernel disk cache normally a duplicate of much
of what is in the DB cache? For linux, does the kernel cache use only
"available" memory until a program needs it, while the pgsql DB cache
memory is guaranteed at startup?

TIA.

Ed

---------------------------(end of broadcast)---------------------------
TIP 8: explain analyze is your friend

Nov 22 '05 #2

Martijn van Oosterhout

On Fri, Jan 30, 2004 at 03:19:56PM -0700, Ed L. wrote:

On Friday January 30 2004 2:33, Ed L. wrote:

One key aspect of pgsql performance tuning is to adjust the memory
consumption settings (shared_buffers, sort_mem, etc) large enough to hold
as much of the database in shared memory as possible while not causing
page swap-ins. I understand that both page swap-outs and swap space
usage is normal and OK, but lots of page swap-ins are bad). In other
words, for absolute fastest performance, we want a database buffer cache
hit rate of as close to 100% as possible.
I'm also curious about the relationship of DB shared buffer cache to the
linux/hpux kernel caches. In particular, if the block being retrieved in
pgsql was in the kernel's cache but not in the DB cache, thereby forcing a
read() system call, what kind of quantitative difference in performance
would one expect when comparing with block retrievals coming from the
cache? I would think they'd differ only by something on the order of
microseconds. Is the linux kernel disk cache normally a duplicate of much
of what is in the DB cache? For linux, does the kernel cache use only
"available" memory until a program needs it, while the pgsql DB cache
memory is guaranteed at startup?

Whenever the database needs a block not in memory it get loaded and ends up
in both the OS cache and the DB cache. The difference between getting a
block out of DB cache and OS cache is very, very small compared to loading
off disk.

Hence, the larger your DB cache, the less overall data you can cache due to
the duplication. OTOH, postgres can only directly use the DB cache and can
only access the OS cache indirectly.

What I try to aim for is the keep the DB cache somewhere near the "working
set", ie the amount of space used by all the system tables and commonly used
tables in the system. It's shared by all postgres servers in a cluster so
keep that in mind. Also the OS cache may be shared by other processes, so
your results will depend on what other things are happening.

Finally, at least on Linux, the shared memory postgres uses for cache can
also be swapped out making it very difficult to determine the correct value.

Hope this helps,

--
Martijn van Oosterhout <kl*****@svana.org> http://svana.org/kleptog/ (... have gone from d-i being barely usable even by its developers
anywhere, to being about 20% done. Sweet. And the last 80% usually takes
20% of the time, too, right?) -- Anthony Towns, debian-devel-announce

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.6 (GNU/Linux)
Comment: For info see http://www.gnupg.org

iD8DBQFAGv+nY5Twig3Ge+YRAlobAKDUzlZXSZt+tDn5c3ewkS uj965lMACeKUT1
AV/p/m2Pvf8QUF3MTH/opXk=
=XaoN
-----END PGP SIGNATURE-----

Nov 22 '05 #3

Jan Wieck

But keep in mind that all these strategies need to be reevaluated with
7.5 and its ARC.
Jan

Martijn van Oosterhout wrote:

On Fri, Jan 30, 2004 at 03:19:56PM -0700, Ed L. wrote:
On Friday January 30 2004 2:33, Ed L. wrote:
>
> One key aspect of pgsql performance tuning is to adjust the memory
> consumption settings (shared_buffers, sort_mem, etc) large enough to hold
> as much of the database in shared memory as possible while not causing
> page swap-ins. I understand that both page swap-outs and swap space
> usage is normal and OK, but lots of page swap-ins are bad). In other
> words, for absolute fastest performance, we want a database buffer cache
> hit rate of as close to 100% as possible.

I'm also curious about the relationship of DB shared buffer cache to the
linux/hpux kernel caches. In particular, if the block being retrieved in
pgsql was in the kernel's cache but not in the DB cache, thereby forcing a
read() system call, what kind of quantitative difference in performance
would one expect when comparing with block retrievals coming from the
cache? I would think they'd differ only by something on the order of
microseconds. Is the linux kernel disk cache normally a duplicate of much
of what is in the DB cache? For linux, does the kernel cache use only
"available" memory until a program needs it, while the pgsql DB cache
memory is guaranteed at startup?

Whenever the database needs a block not in memory it get loaded and ends up
in both the OS cache and the DB cache. The difference between getting a
block out of DB cache and OS cache is very, very small compared to loading
off disk.

Hence, the larger your DB cache, the less overall data you can cache due to
the duplication. OTOH, postgres can only directly use the DB cache and can
only access the OS cache indirectly.

What I try to aim for is the keep the DB cache somewhere near the "working
set", ie the amount of space used by all the system tables and commonly used
tables in the system. It's shared by all postgres servers in a cluster so
keep that in mind. Also the OS cache may be shared by other processes, so
your results will depend on what other things are happening.

Finally, at least on Linux, the shared memory postgres uses for cache can
also be swapped out making it very difficult to determine the correct value.

Hope this helps,

--
#================================================= =====================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me. #
#================================================= = Ja******@Yahoo.com #
---------------------------(end of broadcast)---------------------------
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to ma*******@postgresql.org)

Nov 22 '05 #4

Ed L.

On Friday January 30 2004 6:06, Martijn van Oosterhout wrote:

On Fri, Jan 30, 2004 at 03:19:56PM -0700, Ed L. wrote:

I'm also curious about the relationship of DB shared buffer cache to
the linux/hpux kernel caches. ...
Whenever the database needs a block not in memory it get loaded and ends
up in both the OS cache and the DB cache. The difference between getting
a block out of DB cache and OS cache is very, very small compared to
loading off disk.

So the OS cache and the DB cache may well have duplicate copies of the data?
With OS cache being more susceptible to preemption from other processes?
Finally, at least on Linux, the shared memory postgres uses for cache can
also be swapped out making it very difficult to determine the correct
value.

What would cause DB shared memory to be swapped out in linux? I thought it
was all pre-allocated at pgsql startup. Is this the "well its not really
pre-allocated but rather supplied on demand" story? I think I saw some
sort of kernel parameter controlling a similar (same?) policy...

TIA.

---------------------------(end of broadcast)---------------------------
TIP 4: Don't 'kill -9' the postmaster

Nov 22 '05 #5

Martijn van Oosterhout

On Tue, Feb 10, 2004 at 01:20:32PM -0700, Ed L. wrote:

On Friday January 30 2004 6:06, Martijn van Oosterhout wrote:
On Fri, Jan 30, 2004 at 03:19:56PM -0700, Ed L. wrote:

I'm also curious about the relationship of DB shared buffer cache to
the linux/hpux kernel caches. ...
Whenever the database needs a block not in memory it get loaded and ends
up in both the OS cache and the DB cache. The difference between getting
a block out of DB cache and OS cache is very, very small compared to
loading off disk.

So the OS cache and the DB cache may well have duplicate copies of the data?
With OS cache being more susceptible to preemption from other processes?

Will have. Any time postgresql does a read() the kernel is copying a buffer
from it's cache to postgresql's internal cache. The kernel cache is more
susceptable to reuse by other processes, but it's still double.

Finally, at least on Linux, the shared memory postgres uses for cache can
also be swapped out making it very difficult to determine the correct
value.

What would cause DB shared memory to be swapped out in linux? I thought it
was all pre-allocated at pgsql startup. Is this the "well its not really
pre-allocated but rather supplied on demand" story? I think I saw some
sort of kernel parameter controlling a similar (same?) policy...

Memory pressure. Same rules apply as for everything else I beleive. If you
allocate a huge amount to shared memory, fill it up and then don't use it
for a while while the rest of the system needs the memory, it'll get swapped
out.

I generally give Postgresql about 64-128MB of shared memory, which covers
all of the system tables and the most commonly used small tables. The rest
of the memory (this is a 1GB machine) I leave for the kernel to manage for
the very large tables.

I think giving postgresql half your memory is couterproductive.

--
Martijn van Oosterhout <kl*****@svana.org> http://svana.org/kleptog/ (... have gone from d-i being barely usable even by its developers
anywhere, to being about 20% done. Sweet. And the last 80% usually takes
20% of the time, too, right?) -- Anthony Towns, debian-devel-announce

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.6 (GNU/Linux)
Comment: For info see http://www.gnupg.org

iD8DBQFAKUI3Y5Twig3Ge+YRAjYrAJ4zaoXQCtYI8tsXqXDl7W 4tKIDtqgCgi7+u
Qa/QJnTIWaYt8IfdBhrKrnU=
=kntV
-----END PGP SIGNATURE-----

Nov 22 '05 #6

Ed L.

On Tuesday February 10 2004 1:42, Martijn van Oosterhout wrote:

I generally give Postgresql about 64-128MB of shared memory, which covers
all of the system tables and the most commonly used small tables. The
rest of the memory (this is a 1GB machine) I leave for the kernel to
manage for the very large tables.
Interesting. Why leave very large tables to the kernel instead of the db
cache? Assuming a dedicated DB server and a DB smaller than available RAM,
why not give the DB enough RAM to get the entire DB into the DB cache?
(Assuming you have the RAM).

Is there a point of diminishing returns for DB cache size *prior* to running
out of data to load into, and access from, that cache? Anyone know how to
guage that point? Maybe shared memory management overhead versus the
performance difference between DB cache reads and OS cache reads? We have
some pretty intense DB loads, but that feels like counting angels on the
head of a pin...
I think giving postgresql half your memory is couterproductive.

I get the sense it really depends on the nature of the apps running on the
box. If it's a dedicated server (not sharing with an app, a web server,
etc), then giving it enough to get the entire data in memory without
swapping is ideal (without swapping), no matter whether thats 10MB or 10GB
of DB cache. If its a cluster server running along side lots of other apps
and they all generally have the same priority, and assuming disk cache
reads are nearly as fast as DB cache reads, it seems the less DB cache you
have, the better; let all apps leverage the kernel cache equally. Would
you agree?

One troublesome case for which I wonder if there is an easy tweak
unbeknownst to me: a huge table that gets lots of random reads and writes.
It's too big for the cache, and the volume of random queries is such that
its a disk I/O drain. Of course, more memory or better data modeling could
help. Maybe that is the worst case scenario.

Thx.

---------------------------(end of broadcast)---------------------------
TIP 9: the planner will ignore your desire to choose an index scan if your
joining column's datatypes do not match

Nov 22 '05 #7

scott.marlowe

On Tue, 10 Feb 2004, Ed L. wrote:

On Tuesday February 10 2004 1:42, Martijn van Oosterhout wrote:
I generally give Postgresql about 64-128MB of shared memory, which covers
all of the system tables and the most commonly used small tables. The
rest of the memory (this is a 1GB machine) I leave for the kernel to
manage for the very large tables.

Interesting. Why leave very large tables to the kernel instead of the db
cache? Assuming a dedicated DB server and a DB smaller than available RAM,
why not give the DB enough RAM to get the entire DB into the DB cache?
(Assuming you have the RAM).

Because the kernel is more efficient (right now) at caching large data
sets.

With the ARC cache manager that will likely wend it's way into 7.5, it's
quite a likely possibility that postgresql will be able to efficiently
handle a larger cache, but it will still be a shared memory cache, and
those are still usually much slower than the kernel's cache.
---------------------------(end of broadcast)---------------------------
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to ma*******@postgresql.org so that your
message can get through to the mailing list cleanly

Nov 22 '05 #8

Ed L.

On Tuesday February 10 2004 3:48, scott.marlowe wrote:

On Tue, 10 Feb 2004, Ed L. wrote:
Interesting. Why leave very large tables to the kernel instead of the
db cache? Assuming a dedicated DB server and a DB smaller than
available RAM, why not give the DB enough RAM to get the entire DB into
the DB cache? (Assuming you have the RAM).

Because the kernel is more efficient (right now) at caching large data
sets.

With the ARC cache manager that will likely wend it's way into 7.5, it's
quite a likely possibility that postgresql will be able to efficiently
handle a larger cache, but it will still be a shared memory cache, and
those are still usually much slower than the kernel's cache.

Hmmm. Others have asserted/assumed they'd be roughly equivalent. It'd be
interesting to see some real data measuring the performance of the shared
mem cache vs. kernel cache. Anyone know of existing benchmarks?
---------------------------(end of broadcast)---------------------------
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to ma*******@postgresql.org)

Nov 22 '05 #9

scott.marlowe

On Tue, 10 Feb 2004, Ed L. wrote:

On Tuesday February 10 2004 3:48, scott.marlowe wrote:
On Tue, 10 Feb 2004, Ed L. wrote:
Interesting. Why leave very large tables to the kernel instead of the
db cache? Assuming a dedicated DB server and a DB smaller than
available RAM, why not give the DB enough RAM to get the entire DB into
the DB cache? (Assuming you have the RAM).

Because the kernel is more efficient (right now) at caching large data
sets.

With the ARC cache manager that will likely wend it's way into 7.5, it's
quite a likely possibility that postgresql will be able to efficiently
handle a larger cache, but it will still be a shared memory cache, and
those are still usually much slower than the kernel's cache.

Hmmm. Others have asserted/assumed they'd be roughly equivalent. It'd be
interesting to see some real data measuring the performance of the shared
mem cache vs. kernel cache. Anyone know of existing benchmarks?

Testing from various folks has shown that there is a maximum size of
somewhere around 256 meg or so where the returns diminish pretty quickly,
unless your working data sets are larger than that. I.e. setting a
database up that needs 20 megs to throw data around in with 200 megs of
buffers is usually slower than handing it 32 megs of buffer and letting
the kernel do it.

But ARC may change all of that. I'll have to grab a CVS snapshot and see
how it behaves with large datasets and shared buffers...
---------------------------(end of broadcast)---------------------------
TIP 4: Don't 'kill -9' the postmaster

Nov 22 '05 #10

Jenny Zhang

On Tue, 2004-02-10 at 16:17, scott.marlowe wrote:

On Tue, 10 Feb 2004, Ed L. wrote:
On Tuesday February 10 2004 3:48, scott.marlowe wrote:
On Tue, 10 Feb 2004, Ed L. wrote:
> Interesting. Why leave very large tables to the kernel instead of the
> db cache? Assuming a dedicated DB server and a DB smaller than
> available RAM, why not give the DB enough RAM to get the entire DB into
> the DB cache? (Assuming you have the RAM).

Because the kernel is more efficient (right now) at caching large data
sets.

With the ARC cache manager that will likely wend it's way into 7.5, it's
quite a likely possibility that postgresql will be able to efficiently
handle a larger cache, but it will still be a shared memory cache, and
those are still usually much slower than the kernel's cache.

Hmmm. Others have asserted/assumed they'd be roughly equivalent. It'd be
interesting to see some real data measuring the performance of the shared
mem cache vs. kernel cache. Anyone know of existing benchmarks?

Testing from various folks has shown that there is a maximum size of
somewhere around 256 meg or so where the returns diminish pretty quickly,
unless your working data sets are larger than that. I.e. setting a
database up that needs 20 megs to throw data around in with 200 megs of
buffers is usually slower than handing it 32 megs of buffer and letting
the kernel do it.

But ARC may change all of that. I'll have to grab a CVS snapshot and see
how it behaves with large datasets and shared buffers...

The latest osdl-dbt3 run on our stp system used about 250M for shared
buffer. The database was about 3.8G. The system has 8G of RAM, most of
which are used by the database. I did try to increase the shared_buffer
but the performance decreased. For this workload, the bottleneck is the
I/O for the pgsql_tmp directory, since those are complicated queries
with group by and sorting.

The result is at:
http://khack.osdl.org/stp/287555/

It would be interesting to see how 7.5 handles this workload. Also you
are welcome to use STP to try out different database parameters with
this workload. Let me know if you are interested, I will follow up with
more instructions.

Jenny
---------------------------(end of broadcast)---------------------------
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faqs/FAQ.html

Nov 22 '05 #11

Tom Lane

"Ed L." <pg***@bluepolka.net> writes:

On Tuesday February 10 2004 1:42, Martijn van Oosterhout wrote:
I think giving postgresql half your memory is couterproductive.
I get the sense it really depends on the nature of the apps running on the
box.

Giving PG half the RAM is counterproductive no matter what --- that
pretty much guarantees that every page that's in RAM will be in RAM
twice, once in PG buffers and once in kernel buffers. The two
reasonable schools of thought are (1) to make PG buffers relatively
small and let the kernel do the bulk of the buffering, or (2) to give
PG most of the RAM and expect it to do the bulk of the buffering.

Past experience has been that theory (1) generally wins. Jan's recent
work on ARC buffering may make things more interesting, though.

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 4: Don't 'kill -9' the postmaster

Nov 22 '05 #12

Christopher Browne

Martha Stewart called it a Good Thing when pg***@bluepolka.net ("Ed L.") wrote:

On Tuesday February 10 2004 3:48, scott.marlowe wrote:
On Tue, 10 Feb 2004, Ed L. wrote:
> Interesting. Why leave very large tables to the kernel instead
> of the db cache? Assuming a dedicated DB server and a DB smaller
> than available RAM, why not give the DB enough RAM to get the
> entire DB into the DB cache? (Assuming you have the RAM).

Because the kernel is more efficient (right now) at caching large
data sets.

With the ARC cache manager that will likely wend it's way into 7.5,
it's quite a likely possibility that postgresql will be able to
efficiently handle a larger cache, but it will still be a shared
memory cache, and those are still usually much slower than the
kernel's cache.

Hmmm. Others have asserted/assumed they'd be roughly equivalent.
It'd be interesting to see some real data measuring the performance
of the shared mem cache vs. kernel cache. Anyone know of existing
benchmarks?

The problem with the shared memory cache is that there seems to be
some variation from OS to OS as to how efficiently the system can
manage shared memory as the amount of it grows.

The same is commonly true for various sorts of resources:

- On Linux, for a long time, you didn't want to have a lot of files in
one directory as it did a linked-list scan to look for files.
ReiserFS specifically planned an improvement to this by replacing
the linked list with a b-tree.

- Linux and Solaris are conspicuously different in their process
scheduling models. With Solaris, performance suffers if you have
a great deal of processes; the architecture encourages use of
threading to get good performance. In contrast, Linux 2.6 has
what they call an "O(1)" process scheduler where having lots of
processes does not substantially slow down OS performance.

The benchmarks have been a bit limited thus far; people haven't found
that having much more than 10000 buffer pages to be particularly
helpful. And there has been some indication that sizes larger than
that can slow things down. And that may not be a constant across OS
implementations...
--
"cbbrowne","@","cbbrowne.com"
http://www3.sympatico.ca/cbbrowne/rdbms.html
Rules of the Evil Overlord #151. "I will not set myself up as a
god. That perilous position is reserved for my trusted lieutenant."
<http://www.eviloverlord.com/>

Nov 22 '05 #13

Ed L.

On Tuesday February 10 2004 10:26, Tom Lane wrote:

Giving PG half the RAM is counterproductive no matter what --- that
pretty much guarantees that every page that's in RAM will be in RAM
twice, once in PG buffers and once in kernel buffers. The two
reasonable schools of thought are (1) to make PG buffers relatively
small and let the kernel do the bulk of the buffering, or (2) to give
PG most of the RAM and expect it to do the bulk of the buffering.

Past experience has been that theory (1) generally wins. Jan's recent
work on ARC buffering may make things more interesting, though.

So, pursuing theory (1) in 'the time before ARC', assuming you have a
dedicated box with little or no non-db competition for resources, why give
PG anymore than is absolutely required (max_connections*2 or 16)? Why not
just let the kernel reign as completely as possible? Is that what you mean
by "relatively small"?

TIA.
---------------------------(end of broadcast)---------------------------
TIP 6: Have you searched our list archives?

http://archives.postgresql.org

Nov 22 '05 #14

Tom Lane

"Ed L." <pg***@bluepolka.net> writes:

So, pursuing theory (1) in 'the time before ARC', assuming you have a
dedicated box with little or no non-db competition for resources, why give
PG anymore than is absolutely required (max_connections*2 or 16)? Why not
just let the kernel reign as completely as possible? Is that what you mean
by "relatively small"?

Well, if you go *really* small then you find a lot of CPU time gets
wasted shuffling data from kernel cache to PG cache. The sweet spot
for theory (1) seems to be to set shared_buffers in the range of 1000 to
10000 buffers. (Time was that that was a serious amount of RAM, but
not any more...)

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 9: the planner will ignore your desire to choose an index scan if your
joining column's datatypes do not match

Nov 22 '05 #15

NTPT

I make some test on my production system while maintance time so no user use the system.It run on old box with k6-2/500 and 3* 128mb DIMMs pg 7.3.2
I have table with around 100 000 lines like this : owner int8,id int8, content text , index on column owner (this index is used by text query).

i manage a test query select * from table where owner='10000' order by id oddset 100 limit 50 . There is around 10 000 lines before limit and offset apply. i run explain analyze and got this

Case 1: (no memory to Pg )
sort_mem = 64Kb
effective_cache_size = 64 (8k blocks - Around 512 kb )

Take 3300 ms
Case 2: (give a lot of ram to postgres)
sort_mem = 64000 Kb
effective_cache_size = 8192 (8k blocks)

Swapping occured , execution take 5400 ms
Case 3:
sort_mem = 16000 Kb
effective_cache_size = 512 (8k blocks - Around 4 Mb )

Take 1200 ms. The best result.
Case 4:
sort_mem = 16000 Kb
effective_cache_size = 64 (8k blocks - Around 512 kb )

Take 1900 ms.. In this case i try to increase effective_cache_size step by step 64,128,256,512,1024 but increase effective_cache_size up from 512 have no dramatic impact on performance.
I test shared_buffer settings, between 2*max_connection ie 128*2 to 1024 but have no dramatic impact on performance, but in my tests no focus on this
It seems that :

1: Settings memory limits too high, whnen machine start using swap space is WROSE then give postgres as low memory as possible.

2: settings of sort_mem have bigger impact on performance then settings of effective_cache_size , if db cache hold at least hunderds of disk pages. More than 1000 disk pages in effective cache size have no sense.

I can not made reliable simulations with this database on real user load :(, so with more concurent user result may be different. May be a good "stress test" suite for postgresql database is needed for future testings.

----- PÅ¯vodnÃ* zprÃ¡va -----
Od: "Ed L." <pg***@bluepolka.net>
Komu: "Tom Lane" <tg*@sss.pgh.pa.us>
Kopie: "Martijn van Oosterhout" <kl*****@svana.org>; <pg***********@postgresql.org>
OdeslÃ¡no: 11. Ãºnora 2004 7:01
PÅ™edmÄ›t: Re: [GENERAL] DB cache size strategies

On Tuesday February 10 2004 10:26, Tom Lane wrote:

Giving PG half the RAM is counterproductive no matter what --- that
pretty much guarantees that every page that's in RAM will be in RAM
twice, once in PG buffers and once in kernel buffers. The two
reasonable schools of thought are (1) to make PG buffers relatively
small and let the kernel do the bulk of the buffering, or (2) to give
PG most of the RAM and expect it to do the bulk of the buffering.

Past experience has been that theory (1) generally wins. Jan's recent
work on ARC buffering may make things more interesting, though.

So, pursuing theory (1) in 'the time before ARC', assuming you have a
dedicated box with little or no non-db competition for resources, why give
PG anymore than is absolutely required (max_connections*2 or 16)? Why not
just let the kernel reign as completely as possible? Is that what you mean
by "relatively small"?

TIA.
---------------------------(end of broadcast)---------------------------
TIP 6: Have you searched our list archives?

http://archives.postgresql.org

---------------------------(end of broadcast)---------------------------
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faqs/FAQ.html

Nov 22 '05 #16

Richard Huxton

On Wednesday 11 February 2004 09:42, NTPT wrote:

It seems that :

1: Settings memory limits too high, whnen machine start using swap space
is WROSE then give postgres as low memory as possible.
Swapping is always bad news.
2: settings of sort_mem have bigger impact on performance then settings
of effective_cache_size , if db cache hold at least hunderds of disk
pages. More than 1000 disk pages in effective cache size have no sense.

I can not made reliable simulations with this database on real user load
:(, so with more concurent user result may be different. May be a good
"stress test" suite for postgresql database is needed for future testings.

It will be radically different with concurrent users. Your large sort_mem will
apply to every sort they do and so you will probably end up going into swap
again.

This is one of the reasons why performance tuning is complicated, and you need
to test a real-world workload wherever possible.
--
Richard Huxton
Archonet Ltd

---------------------------(end of broadcast)---------------------------
TIP 6: Have you searched our list archives?

http://archives.postgresql.org

Nov 22 '05 #17

Andrew Sullivan

On Wed, Feb 11, 2004 at 12:26:06AM -0500, Tom Lane wrote:

Giving PG half the RAM is counterproductive no matter what --- that
pretty much guarantees that every page that's in RAM will be in RAM
twice, once in PG buffers and once in kernel buffers. The two

Well, unless you're using an OS which allows you to set a nocache
option for a filesystem (Solaris allows this on UFS, for instance).
But the kernel cache, in my experience, is _still_ the winner,
because the LRU database cache isn't real efficient at large sizes.

--
Andrew Sullivan

---------------------------(end of broadcast)---------------------------
TIP 9: the planner will ignore your desire to choose an index scan if your
joining column's datatypes do not match

Nov 22 '05 #18

scott.marlowe

On Wed, 11 Feb 2004, NTPT wrote:

Take 1900 ms.. In this case i try to increase effective_cache_size step
by step 64,128,256,512,1024 but increase effective_cache_size up from
512 have no dramatic impact on performance.
Note that effective_cache_size ONLY affects the query plan chosen. It has
no effect on the actual size of the kernel or postgresql caches. It is
just a hint to the planner about how large the kernel cache is.
I test shared_buffer settings, between 2*max_connection ie 128*2 to
1024 but have no dramatic impact on performance, but in my tests no
focus on this

If the data you're tossing around is ~1 Megabyte, then having more than 1
megabyte in shared buffers won't help. Basically, if shared buffers is
smaller than the data you're throwing about, then increasing it tends to
help, if it's larger than the amount of data being pulled / joined / etc..
then it doesn't help to make it larger.
---------------------------(end of broadcast)---------------------------
TIP 4: Don't 'kill -9' the postmaster

Nov 22 '05 #19

Ed L.

On Tuesday February 10 2004 11:17, Tom Lane wrote:

Well, if you go *really* small then you find a lot of CPU time gets
wasted shuffling data from kernel cache to PG cache. The sweet spot
for theory (1) seems to be to set shared_buffers in the range of 1000 to
10000 buffers. (Time was that that was a serious amount of RAM, but
not any more...)

In theory (1), any consensus on what criteria one ought to use when deciding
between 1000 and 10000?

In general, would it be true to say that if one does *not* anticipate
contention for kernel disk cache space from non-DB processes (e.g., the
dedicated db server), then you probably want to use theory (1)? If one
*does* anticipate such contention (e.g. the all-on-one-box web-db app),
then you probably want to use theory (2) in order to ensure an adequate
cache?

Ed

---------------------------(end of broadcast)---------------------------
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to ma*******@postgresql.org)

Nov 22 '05 #20

Tom Lane

"Ed L." <pg***@bluepolka.net> writes:

In general, would it be true to say that if one does *not* anticipate
contention for kernel disk cache space from non-DB processes (e.g., the
dedicated db server), then you probably want to use theory (1)? If one
*does* anticipate such contention (e.g. the all-on-one-box web-db app),
then you probably want to use theory (2) in order to ensure an adequate
cache?

If the box is doing other things then I think you definitely want to
keep shared_buffers pretty small, relying on the kernel to allocate the
available resources as best it can.

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 4: Don't 'kill -9' the postmaster

Nov 22 '05 #21

Tom Lane

"Ed L." <pg***@bluepolka.net> writes:

Then what scenarios, if any, merit theory (2) over theory (1)?

I'd only consider a large-cache setting on a machine that's dedicated to
running the database (where "dedicated" means "that's the only thing you
care about performance of", as in your first scenario). Even then I'd
test it against the other way. As Andrew Sullivan notes nearby, our
experience has been that the PostgreSQL buffer manager isn't all that
efficient about managing large caches. It's possible that Jan's current
work will change that situation in 7.5, but I'd still test first ...

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to ma*******@postgresql.org)

Nov 22 '05 #22

Ed L.

On Wednesday February 11 2004 9:18, Tom Lane wrote:

"Ed L." <pg***@bluepolka.net> writes:
In general, would it be true to say that if one does *not* anticipate
contention for kernel disk cache space from non-DB processes (e.g., the
dedicated db server), then you probably want to use theory (1)? If one
*does* anticipate such contention (e.g. the all-on-one-box web-db app),
then you probably want to use theory (2) in order to ensure an adequate
cache?

If the box is doing other things then I think you definitely want to
keep shared_buffers pretty small, relying on the kernel to allocate the
available resources as best it can.

Then what scenarios, if any, merit theory (2) over theory (1)? I can think
of two where I'd expect that to be the case.

First, suppose DB processes are more important (and thus more deserving of
cache) than other processes in an all-on-one-box case. For example, the
non-DB processes consisted strictly of various non-performance-critical
programs to analyze large log files, etc. Would it then make sense to use
theory (2) to ensure those non-critical programs do not inadvertently
increase I/O for the DB processes?

Second, suppose we have 2 clusters running on a dedicated DB server, each
with a large enough dataset to cause the other's data to be completely
crowded out of the kernel cache during backups under theory (1), causing a
lot of disk I/O for the other as the other gradually reloads. If we use
theory (2), allocating roughly half of available RAM to each DB cache
(minus breathing room for admin, OS), I would expect that over time, the
entire DB dataset for each cluster would work its way into each cluster's
DB cache and be largely immune to the activities of the other cluster.
We'd consider that a good thing. Would this be an appropriate scenario for
theory (2)?

---------------------------(end of broadcast)---------------------------
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to ma*******@postgresql.org)

Nov 22 '05 #23

Ed L.

On Wednesday February 11 2004 9:57, Tom Lane wrote:

"Ed L." <pg***@bluepolka.net> writes:
Then what scenarios, if any, merit theory (2) over theory (1)?

I'd only consider a large-cache setting on a machine that's dedicated to
running the database (where "dedicated" means "that's the only thing you
care about performance of", as in your first scenario). Even then I'd
test it against the other way. As Andrew Sullivan notes nearby, our
experience has been that the PostgreSQL buffer manager isn't all that
efficient about managing large caches. It's possible that Jan's current
work will change that situation in 7.5, but I'd still test first ...

Great. Thx to all for feedback, very informative, interesting, and helpful
in practice.

Ed

---------------------------(end of broadcast)---------------------------
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faqs/FAQ.html

Nov 22 '05 #24

Martijn van Oosterhout

On Wed, Feb 11, 2004 at 09:47:35AM -0700, Ed L. wrote:

First, suppose DB processes are more important (and thus more deserving of
cache) than other processes in an all-on-one-box case. For example, the
non-DB processes consisted strictly of various non-performance-critical
programs to analyze large log files, etc. Would it then make sense to use
theory (2) to ensure those non-critical programs do not inadvertently
increase I/O for the DB processes?
Are you assuming that shared memory will not be swapped out? If you allocate
lots of the shared memory to the cluster that it's not going to used anyway
most of the time, you're just going to find it's being swapped.
Second, suppose we have 2 clusters running on a dedicated DB server, each
with a large enough dataset to cause the other's data to be completely
crowded out of the kernel cache during backups under theory (1), causing a
lot of disk I/O for the other as the other gradually reloads. If we use
theory (2), allocating roughly half of available RAM to each DB cache
(minus breathing room for admin, OS), I would expect that over time, the
entire DB dataset for each cluster would work its way into each cluster's
DB cache and be largely immune to the activities of the other cluster.
We'd consider that a good thing. Would this be an appropriate scenario for
theory (2)?
I'd expect the kernel the swap out enough of the shared memory in each
cluster to match the actual working set. Although, I imagine PostgreSQL's
buffer management may affect the way it's gets swapped. I have no idea
whether PostgreSQL's buffer management code does a lot of page touching.

Note, some OS's don't swap out shared memory, so the above doesn't apply to
them.

--
Martijn van Oosterhout <kl*****@svana.org> http://svana.org/kleptog/ (... have gone from d-i being barely usable even by its developers
anywhere, to being about 20% done. Sweet. And the last 80% usually takes
20% of the time, too, right?) -- Anthony Towns, debian-devel-announce

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.6 (GNU/Linux)
Comment: For info see http://www.gnupg.org

iD8DBQFAKo4FY5Twig3Ge+YRAvkMAKC0mNBg4P6Tr3Rp6anJua Yal9qkNwCdHK6A
bdzipH++nDu2VnFCLAhhP5Y=
=0vKr
-----END PGP SIGNATURE-----

Nov 22 '05 #25

Rick Gigger

Has anyone discussed adding to postgres the ability to figure this out
on it's own. Couldn't it gather some statistics about the kind of
resources that it is actually using and adjust accordingly. You could
give it a max amount to use for the shared buffers but if it was so high
that it degraded performance postgres could just cut back on what it
actually used.

Is this even feasile? Correct me if I am wrong but it seems that most
other dbs seem to work this way.

It would make installing a nice tuned postgres a much more turn key
operation.

rick
Ed L. wrote:

On Wednesday February 11 2004 9:57, Tom Lane wrote:
"Ed L." <pg***@bluepolka.net> writes:
Then what scenarios, if any, merit theory (2) over theory (1)?

I'd only consider a large-cache setting on a machine that's dedicated to
running the database (where "dedicated" means "that's the only thing you
care about performance of", as in your first scenario). Even then I'd
test it against the other way. As Andrew Sullivan notes nearby, our
experience has been that the PostgreSQL buffer manager isn't all that
efficient about managing large caches. It's possible that Jan's current
work will change that situation in 7.5, but I'd still test first ...

Great. Thx to all for feedback, very informative, interesting, and helpful
in practice.

Ed

---------------------------(end of broadcast)---------------------------
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faqs/FAQ.html

---------------------------(end of broadcast)---------------------------
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faqs/FAQ.html

Nov 22 '05 #26

Richard Huxton

On Wednesday 11 February 2004 21:40, Rick Gigger wrote:

Has anyone discussed adding to postgres the ability to figure this out
on it's own. Couldn't it gather some statistics about the kind of
resources that it is actually using and adjust accordingly. You could
give it a max amount to use for the shared buffers but if it was so high
that it degraded performance postgres could just cut back on what it
actually used.

Is this even feasile? Correct me if I am wrong but it seems that most
other dbs seem to work this way.

It would make installing a nice tuned postgres a much more turn key
operation.

What if making the DB run faster makes everything else run slower? How does it
know whether 0.1sec i/o time was just it, or if there was another process
contending for disk access?

Then again, maybe interactive speed isn't important, but your bulk update is.
Or, perhaps your report can wait, but a realtime response is vital.

--
Richard Huxton
Archonet Ltd

---------------------------(end of broadcast)---------------------------
TIP 9: the planner will ignore your desire to choose an index scan if your
joining column's datatypes do not match

Nov 22 '05 #27

Similar topics