I have six hard drives (4 SCSI and 2 EIDE) on my main machine with parts
of a database on each drive. The main index is on one SCSI drive all to
itself. The main data are on the other three SCSI drives. Small relations
are on one EIDE drive, and the logfiles are on the other EIDE drive. When
running the task, below, the rest of the machine is not doing much.
I do not remember where I saw it, but somewhere I got the idea that the
number of page cleaners might well be one more than the number of hard
drives, so I set up my database with 7 page cleaners (and 7 page fetchers).
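For anyone wanting to reproduce this setup: in DB2 UDB the cleaner and prefetcher counts are the num_iocleaners and num_ioservers database configuration parameters. A rough sketch of setting them from the CLP, assuming a database named SAMPLE (the name is a placeholder, not mine):

```shell
# num_iocleaners = page cleaners, num_ioservers = prefetchers ("page fetchers")
db2 update db cfg for SAMPLE using NUM_IOCLEANERS 7
db2 update db cfg for SAMPLE using NUM_IOSERVERS 7

# Verify the current settings
db2 get db cfg for SAMPLE | grep -i -e IOCLEANERS -e IOSERVERS
```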
Now a big (for me) task gets into a state where, on Linux, all seven page
cleaners are in the D state. It is all working, but slowly. vmstat reveals
that the IO is going nowhere near the maximum rate that the IO channels
would support: about 6 MBytes per second, even though the IO system can
easily do 36 MBytes per second (e.g., when doing a database reorg). The
SCSI drives are 10,000 rpm Ultra/320 drives, and the SCSI controller is on
a PCI-X bus all its own. The problem is pretty clearly the time the drives
spend seeking. Now perhaps I could increase the size of the buffer pools,
move stuff around on the drives, etc. In fact, the big hog is the
logging.
But what I would like to know is: is there any point in using more page
cleaners than I do now? Is there a better rule than the one I indicated above?
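As an aside for anyone watching the same symptom: processes in uninterruptible disk sleep show up with a D in the STAT column of ps, so a quick way to list them (plain procps ps, nothing DB2-specific) is:

```shell
# List processes currently in uninterruptible (D) state
ps -eo pid,stat,comm | awk '$2 ~ /^D/'
```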
--
.~. Jean-David Beyer Registered Linux User 85642.
/V\ PGP-Key: 9A2FC99A Registered Machine 241939.
/( )\ Shrewsbury, New Jersey http://counter.li.org
^^-^^ 15:40:01 up 17 days, 5:00, 6 users, load average: 4.10, 4.16, 4.12
Jean-David Beyer wrote:
> I have six hard drives (4 SCSI and 2 EIDE) on my main machine with parts
> of a database on each drive. [...]
> In fact, the big hog is the logging.
You should place the log on the fastest drive you have.
--
Knut Stolze
Information Integration
IBM Germany / University of Jena
Knut Stolze wrote:
> Jean-David Beyer wrote:
>> [...] In fact, the big hog is the logging.
> You should place the log on the fastest drive you have.
I am sure you are right. But that does not answer the question on the
proper ratio of page cleaners to drive spindles.
--
.~. Jean-David Beyer Registered Linux User 85642.
/V\ PGP-Key: 9A2FC99A Registered Machine 241939.
/( )\ Shrewsbury, New Jersey http://counter.li.org
^^-^^ 21:25:00 up 17 days, 10:45, 3 users, load average: 4.33, 4.20, 4.11
Jean-David Beyer wrote:
> Knut Stolze wrote:
>> You should place the log on the fastest drive you have.
> I am sure you are right. But that does not answer the question on the
> proper ratio of page cleaners to drive spindles.
Can't answer that either (maybe Matt can jump in). But if you say the
log is the issue then the page cleaners won't help I guess.
BTW, would the configuration assistant be of any help?
Cheers
Serge
Serge Rielau wrote:
> Jean-David Beyer wrote:
>> I am sure you are right. But that does not answer the question on the
>> proper ratio of page cleaners to drive spindles.
> Can't answer that either (maybe Matt can jump in). But if you say the
> log is the issue then the page cleaners won't help I guess.
I guess so, too, but if all 7 are in D-state, they are a bottleneck. Now
if they are all in D-state because they are waiting for disk seeks, would
throughput improve with more of them?
> BTW, would the configuration assistant be of any help?
I do not know what the configuration assistant is. I tried to run it, and
it wants:
$ db2ca
sh: line 1: /opt/IBMJava2-131/jre/bin/java: No such file or directory
DB2JAVIT : RC = 127
This is true enough: that directory does not exist. I would have thought a
Java version around a year old would be acceptable to the current product.
Guess not.
trillian:jdbeyer[~]$ ls -l /opt/
total 48
drwxr-xr-x 3 root root 4096 Jan 1 15:06 IBM
drwxr-xr-x 8 root root 4096 Mar 1 2004 IBMJava2-141 <---<<<
drwxr-xr-x 2 root root 4096 Jan 1 13:30 IBM_PDF_DOCS
On my old machine, I found the Java stuff ran so slowly that it was
useless, so I either use C++ programs or the command-line interpreter.
That was on a machine with 512 MB RAM, two Ultra/2 10,000 rpm SCSI hard
drives, and two 550 MHz Pentium IIIs.
This machine has 4GBytes RAM, 2 Hyperthreaded 3.06GHz Xeons with 1
Megabyte L3 cache, four Ultra/320 10,000rpm SCSI hard drives. But I
haven't tried that stuff. I find the documentation almost impossible to
use with DB2 V8.1.?. Before that I used DB2 V6.1.?, and the documentation
was pretty clear.
--
.~. Jean-David Beyer Registered Linux User 85642.
/V\ PGP-Key: 9A2FC99A Registered Machine 241939.
/( )\ Shrewsbury, New Jersey http://counter.li.org
^^-^^ 21:50:00 up 17 days, 11:10, 3 users, load average: 4.31, 4.12, 4.09
"Jean-David Beyer" <jd*****@exit109.com> wrote in message
news:10*************@corp.supernews.com...
>>>> I have six hard drives (4 SCSI and 2 EIDE) on my main machine with
>>>> parts of a database on each drive. [...] In fact, the big hog is
>>>> the logging.
>>> You should place the log on the fastest drive you have.
>> I am sure you are right. But that does not answer the question on the
>> proper ratio of page cleaners to drive spindles.
> Can't answer that either (maybe Matt can jump in). But if you say the
> log is the issue then the page cleaners won't help I guess.
<jumping in ; re-stating system details for context>
I didn't see what DB2 version you are using, so I'm going to give a general
answer with a bit of historical context, assuming v7 and v8 UDB.
> I do not remember where I saw it, but somewhere I got the idea that the number of page cleaners might well be one more than the number of hard drives, so I set up my database with 7 page cleaners (and 7 page fetchers).
IIRC, this was the recommendation provided in the v7 performance tuning
guide.
Prior to v8, all page cleaner I/O was synchronous -- meaning that while an
I/O (write) was in progress, no other cleaning-related activity (aging
pages, figuring out what page to write next, etc) was being done by that
cleaner process. Thus, the recommendation was to have one cleaner per
physical disk (spindle), in order to keep all the drives busy with cleaning
activity.
In v8, we started using AIO (asynchronous I/O), which allows each cleaner
process to fire off a batch of I/Os to the OS, and then go off and do
housekeeping work (aging pages and determining the next batch of pages to
clean) while the current batch of I/Os are in progress. This is a much more
efficient use of CPU (less context-switching and pre-emption), and scales
much better (especially on large systems with many arrays of disks.)
In v8, the rule of thumb is to use one cleaner per CPU.
> Now a big (for me) task gets into a state where, on Linux, all seven page cleaners are in the D state. It is all working, but slow. vmstat reveals that the IO is going nowhere near the maximum rate that the IO channels would support (about 6 MBytes per second, even though the IO system can easily do 36 MBytes per second). [...] The problem pretty clearly is the time the drives spend seeking. [...] In fact, the big hog is the logging.
Note that the max throughput of your devices is based on sequential access.
Page cleaner access is usually quite random, due to the way pages are hashed
into slots in the bufferpool and the aging characteristics of pages based on
the workload. I'd say a 16% throughput rate is "pretty good" for a random
write workload.
[ Aside: To get a rough estimate of what throughput to expect on a random
workload, take the track-to-track seek time, and divide it by the average
seek time. This represents the access time ratio between a sequential
workload (which will seek to an adjacent track) and a random workload (which
will seek halfway across the disk, on average.) Apply this ratio to your
rated [sequential] throughput and you will have a good estimate of what
throughput to expect for a random workload. Scale as necessary based on
read/write mix and sequential/random properties thereof. This usually works
out to be somewhere between 1/10 (10%) and 1/7 (14%). ]
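Matt's rule of thumb above is easy to wrap in a few lines. This is just a sketch of the arithmetic he describes; the drive numbers in the example are illustrative, not measurements:

```python
def random_throughput_estimate(seq_mb_per_s, track_to_track_ms, avg_seek_ms):
    """Estimate random-workload throughput from the sequential rate,
    scaled by the track-to-track / average seek time ratio."""
    return seq_mb_per_s * (track_to_track_ms / avg_seek_ms)

# Example: 36 MB/s sequential, 0.3 ms track-to-track, 4.5 ms average seek
# -> roughly 2.4 MB/s for a purely random workload.
print(random_throughput_estimate(36.0, 0.3, 4.5))
```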
Also, many of the memory structures (lists) used internally by the
bufferpool are based on the number of cleaners. One of the things we've
seen migrating workloads from v7 to v8 is that reducing the number of
cleaners can actually improve performance. Part of this is due to reduced
CPU contention (having 7 cleaners on a 4-way in v8 means that they will be
competing for CPU; whereas using 4 will promote fair sharing), and the async
nature of the cleaners in v8 allows the OS to make better decisions with
scheduling I/Os, as each cleaner will batch off N I/Os at the same time.
Hopefully this answers the poster's questions.
--
Matt Emmerton
IBM Toronto Software Lab
Matt Emmerton wrote:
[snip]
> <jumping in ; re-stating system details for context>
> I didn't see what DB2 version you are using, so I'm going to give a general answer with a bit of historical context, assuming v7 and v8 UDB.
I think I said, but if not, I am running DB2 UDB V8.1.7 at the moment on a
machine with two 3.06 GHz hyperthreaded Intel Xeon processors -- the ones
with 1 MByte L3 caches (Linux thinks there are four processors). I have 4
GBytes RAM. In the database of interest, most of the relations and their
indices are DMS.
>> I do not remember where I saw it, but somewhere I got the idea that the number of page cleaners might well be one more than the number of hard drives, so I set up my database with 7 page cleaners (and 7 page fetchers).
> IIRC, this was the recommendation provided in the v7 performance tuning guide.
> Prior to v8, all page cleaner I/O was synchronous [...] Thus, the recommendation was to have one cleaner per physical disk (spindle), in order to keep all the drives busy with cleaning activity.
> In v8, we started using AIO (asynchronous I/O), which allows each cleaner process to fire off a batch of I/Os to the OS, and then go off and do housekeeping work while the current batch of I/Os are in progress. [...]
> In v8, the rule of thumb is to use one cleaner per CPU.
So I might get better performance by reducing the number of page cleaners
(and probably page fetchers, though they are not a bottleneck) from the 7
I have now to 4 each, right?
>> Now a big (for me) task gets into a state where, on Linux, all seven page cleaners are in the D state. [...] The problem pretty clearly is the time the drives spend seeking. [...] In fact, the big hog is the logging.
> Note that the max throughput of your devices is based on sequential access. Page cleaner access is usually quite random, due to the way pages are hashed into slots in the bufferpool and the aging characteristics of pages based on the workload. I'd say a 16% throughput rate is "pretty good" for a random write workload.
> [ Aside: To get a rough estimate of what throughput to expect on a random workload, take the track-to-track seek time, and divide it by the average seek time. [...] This usually works out to be somewhere between 1/10 (10%) and 1/7 (14%). ]
According to Maxtor for the SCSI drives, seek times are
track-to-track: 0.3ms
average seek: 4.5ms
full stroke: 11.0 ms
So that comes to 0.3/4.5 = 0.0667 or about 7%.
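To make the arithmetic explicit, here is the ratio applied both to the 320 MBytes/sec interface rate and to the 36 MBytes/sec observed sequential rate (which base rate is the right one to use is my assumption, not something stated above):

```python
ratio = 0.3 / 4.5             # track-to-track / average seek, ~6.7%
print(round(ratio, 4))        # 0.0667
print(round(320 * ratio, 1))  # ~21.3 MB/s from the interface rate
print(round(36 * ratio, 1))   # ~2.4 MB/s from the observed sequential rate
```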
Now the disk controllers can do 320 MBytes/sec advertised maximum, though
I do not recall seeing much over 55 MBytes/sec in actual use over an
extended (> 1 second) period of time. When running a REORGANIZE DATABASE
on a DB2 database, I see 35 MBytes/second for extended (many seconds)
periods. But when just populating a database with INSERTs, it runs 6 to 12
MBytes/sec.
The low number is when setiathome is running at nice level 19, the high
number, when setiathome is not running. I attribute this to the fact that
DB2 must wait for IOs in this case, and while it is waiting, setiathome
dirties the caches in the processors so they run at RAM speed instead of
cache speed.
So without setiathome, according to this calculation, things should run at
about 21 MBytes/sec, not 12. That is less than a 50% error, so perhaps
that is as good as rules of thumb are likely to be.
> Also, many of the memory structures (lists) used internally by the bufferpool are based on the number of cleaners. One of the things we've seen migrating workloads from v7 to v8 is that reducing the number of cleaners can actually improve performance. [...] the async nature of the cleaners in v8 allows the OS to make better decisions with scheduling I/Os, as each cleaner will batch off N I/Os at the same time.
> Hopefully this answers the poster's questions.
That answers a lot of them. Thank you.
It seems, therefore, that I should reduce the number of page cleaners (and
probably page fetchers) from 7 to 4. I should also take the part of the
database that is on one of the SCSI drives and put it onto an EIDE drive,
and move the log files to a SCSI drive. That is more difficult, and I will
probably do it later.
--
.~. Jean-David Beyer Registered Linux User 85642.
/V\ PGP-Key: 9A2FC99A Registered Machine 241939.
/( )\ Shrewsbury, New Jersey http://counter.li.org
^^-^^ 09:05:00 up 17 days, 22:25, 3 users, load average: 4.21, 4.21, 4.19
"Jean-David Beyer" <jd*****@exit109.com> wrote in message
news:10*************@corp.supernews.com...
> I think I said, but if not, I am running DB2 UDB V8.1.7 at the moment on a machine with two 3.06 GHz hyperthreaded Intel Xeon processors -- the ones with 1 MByte L3 caches (Linux thinks there are four processors). I have 4 GBytes RAM. In the database of interest, most of the relations and their indices are DMS.
> [...]
> So I might get better performance by reducing the number of page cleaners (and probably page fetchers, though they are not a bottleneck) from the 7 I have now to 4 each, right?
Correct. Anywhere between 2 and 4 cleaners (accounting for HTT) would be a
reasonable starting point on your system.
>> Now a big (for me) task gets into a state where, on Linux, all seven page cleaners are in the D state. [...] The problem pretty clearly is the time the drives spend seeking. [...] In fact, the big hog is the logging.
> According to Maxtor for the SCSI drives, seek times are
> track-to-track: 0.3 ms; average seek: 4.5 ms; full stroke: 11.0 ms
> So that comes to 0.3/4.5 = 0.0667 or about 7%.
> Now the disk controllers can do 320 MBytes/sec advertised maximum, though I do not recall seeing much over 55 MBytes/sec in actual use over an extended (> 1 second) period of time. When running a REORGANIZE DATABASE on a DB2 database, I see 35 MBytes/second for extended (many seconds) periods. But when just populating a database with INSERTs, it runs 6 to 12 MBytes/sec.
> The low number is when setiathome is running at nice level 19, the high number, when setiathome is not running. [...]
> So without setiathome, according to this calculation, things should run at about 21 MBytes/sec, not 12. That is less than a 50% error, so perhaps that is as good as rules of thumb are likely to be.
You also need to factor in the overhead of whatever read activity you are
doing as well -- so 12 MB/sec may not be all that unreasonable.
The reason you see 35 MB/sec while doing a reorganize is because DB2 does a
lot of sequential reads (which are quick) and the data that is written out
by the page cleaners is a lot less random than during a "normal" workload.
> That answers a lot of them. Thank you.
> It seems, therefore, that I should reduce the number of page cleaners (and probably page fetchers) from 7 to 4. I should also take the part of the database on one of the SCSI drives and put it onto an EIDE drive, and move the log files to a SCSI drive. That is more difficult and I will probably do that later.
Agreed.
--
Matt Emmerton
IBM Toronto Software Lab