469,576 Members | 1,824 Online
Bytes | Developer Community
New Post

Home Posts Topics Members FAQ

Post your question to a community of 469,576 developers. It's quick & easy.

Backup using prefetch size for reading.

Hi,

I have a quick question, when one sets the prefetch size = extent size, then
when doing a backup we will have 1 agent (db2bm) doing the reads. If we have
prefetch size a multiple of extent size AND we have multiple containers
and/or PARALLEL_IO set then not db2bm but the db2 prefetchers will read the
data.
Now no matter which one reads the data, it will still be read in 1 extent
size. Wouldn't it thus be much more efficient to always use the the
prefetchers to read the data?

For example,
A) 2 containers, prefetch = extent: db2bm will read extent size from
container 1, then extent size from container 2, .....

B) 2 containers, prefetch > extent: 2 prefetchers will read 1 extent
size, 1 from container 1 and the other one from container 2.

Seems to me that B is more efficient. But since this is a OLTP database, we
have prefetch size = extent size. Would the logic then be better to state:

A) 1 container, prefetch = extent: db2bm will read extent size from
container 1
B) 2 containers, prefetch = extent: 2 prefetchers will read 1 extent
size, one from container 1 and the other one from container 2.

The limit could then maybe set as follow for DB2_PARALLEL_IO, 1 container:
# of extent prefetching = # IO servers / backup PARALLELISM

i.e.:
10 IO servers, PARALLELISM 2, 1 container: Each db2bm will request 5
extents
The limit could then maybe set as follow for DB2_PARALLEL_IO, 2 containers:
# of extent prefetching = # containers

i.e.:
10 IO servers, PARALLELISM 2, 2 container: Each db2bm will request 2
extents
Or, instead of having the db2bm use pread to get the data, to use
asynchronous IO to read the data (just like for example the log writer in V8
uses asynchronous IO to write log pages).

Thanks.
Nov 12 '05 #1
4 3495
A few parameters need to be clarified here:
1) DB2_PARALLEL_IO {ALL or tbspaceid, tbspaceid, ...} is an environment
profile variable set with the db2set command. It is effective only for
environments where your data is on a RAID type device and a single container
has been dfined for the tblspc.; nowhere else. It does not specifies the
number of containers or extents. When you look at the command, it
identifies which tablespace is defined on a RAID type device. If the
tablespace is not defined on a RAID and ALL is used, the variable is ignored
for that tbspace.
2) For backup purposes, I don't believe that prefetchers are used (set with
the dbh cfg parm. num_ioservers), I may be wrong here, I don't quite
remember. There's a PARALLELISM parm. with the backup command that
specifies the number of tblspcs. to be backed up in parallel.. Default is
one by one, starting with tblspaceid 0.
3) When running queries, the optimizer will look at your tblspc. defs. to
find out EXTENT size, PREFETCH size and no. of containers. If you run an
explain on the query, it will show prefetch enabled.
So, if extent size is 32 pages, containers=3, then your DB2 Admin. Guide
recommends that prefetch size should be set to: nmuber of containers *
extent size. So set to 96 pages.
Given that, then your three extents could be read in parallel on a table
scan. This is what enables DB2 to pre-load the required rows in the buffer
pool.
If prefetch is not enabled and/or your IO request is a single page, as in
OLTP, then your agent does the IO, not db2 and prefetchers (or IO_SERVERS).
4) As each query that is prefetch enabled issues itsIO request, it is queued
and the num_ioservers specifies the number of processes that can be
dispatched in parallel to service any or all IO requests from any or all
queries. The service is FIFO.
HTH, Pierre.

"Erik Hendrix" <he**********@hotmail.com> a écrit dans le message de
news:2d******************************@news.teranew s.com...
Hi,

I have a quick question, when one sets the prefetch size = extent size, then when doing a backup we will have 1 agent (db2bm) doing the reads. If we have prefetch size a multiple of extent size AND we have multiple containers
and/or PARALLEL_IO set then not db2bm but the db2 prefetchers will read the data.
Now no matter which one reads the data, it will still be read in 1 extent
size. Wouldn't it thus be much more efficient to always use the the
prefetchers to read the data?

For example,
A) 2 containers, prefetch = extent: db2bm will read extent size from
container 1, then extent size from container 2, .....

B) 2 containers, prefetch > extent: 2 prefetchers will read 1 extent
size, 1 from container 1 and the other one from container 2.

Seems to me that B is more efficient. But since this is a OLTP database, we have prefetch size = extent size. Would the logic then be better to state:

A) 1 container, prefetch = extent: db2bm will read extent size from
container 1
B) 2 containers, prefetch = extent: 2 prefetchers will read 1 extent
size, one from container 1 and the other one from container 2.

The limit could then maybe set as follow for DB2_PARALLEL_IO, 1 container:
# of extent prefetching = # IO servers / backup PARALLELISM

i.e.:
10 IO servers, PARALLELISM 2, 1 container: Each db2bm will request 5
extents
The limit could then maybe set as follow for DB2_PARALLEL_IO, 2 containers: # of extent prefetching = # containers

i.e.:
10 IO servers, PARALLELISM 2, 2 container: Each db2bm will request 2
extents
Or, instead of having the db2bm use pread to get the data, to use
asynchronous IO to read the data (just like for example the log writer in V8 uses asynchronous IO to write log pages).

Thanks.

Nov 12 '05 #2
Pierre Saint-Jacques <se*****@invalid.net> wrote:
2) For backup purposes, I don't believe that prefetchers are used (set with
the dbh cfg parm. num_ioservers), I may be wrong here, I don't quite
remember. There's a PARALLELISM parm. with the backup command that
specifies the number of tblspcs. to be backed up in parallel..
The Backup utility will use the prefetchers if doing so is appropriate
for the tablespace in question. (The PARALLELISM parameter will determine
how many db2bm processes run simultaneously, and the actual I/Os to the
database will be done either by db2bm or the prefetchers.) We have seen
cases where users would increase the backup parallelism to a large
number and have their backup performance go down because they were starved
for prefetchers, fwiw.
Default is one by one, starting with tblspaceid 0.


This will change in v81FP4. When PARALLELISM > 1, tablespaces will be
backed up in order of largest to smallest.

dave
Nov 12 '05 #3
Thanks very much for bringing me up to date.
These are extremely useful as I've had quite a few questions from other
people on this.
Merci, Pierre.
"David Mooney" <dm*@vnet.ibm.com> a écrit dans le message de
news:bu**********@hanover.torolab.ibm.com...
Pierre Saint-Jacques <se*****@invalid.net> wrote:
2) For backup purposes, I don't believe that prefetchers are used (set with the dbh cfg parm. num_ioservers), I may be wrong here, I don't quite
remember. There's a PARALLELISM parm. with the backup command that
specifies the number of tblspcs. to be backed up in parallel..


The Backup utility will use the prefetchers if doing so is appropriate
for the tablespace in question. (The PARALLELISM parameter will determine
how many db2bm processes run simultaneously, and the actual I/Os to the
database will be done either by db2bm or the prefetchers.) We have seen
cases where users would increase the backup parallelism to a large
number and have their backup performance go down because they were starved
for prefetchers, fwiw.
Default is one by one, starting with tblspaceid 0.


This will change in v81FP4. When PARALLELISM > 1, tablespaces will be
backed up in order of largest to smallest.

dave


Nov 12 '05 #4
Hey David,

Thanks for the information on the order. When tuning/monitoring my backups I
did feel that it would be so much better to go from biggest to smallest
since the parallelism works on a tablespace level. It's nice to know this is
in FP4 now.
"David Mooney" <dm*@vnet.ibm.com> wrote in message
news:bu**********@hanover.torolab.ibm.com...
Pierre Saint-Jacques <se*****@invalid.net> wrote:
2) For backup purposes, I don't believe that prefetchers are used (set with the dbh cfg parm. num_ioservers), I may be wrong here, I don't quite
remember. There's a PARALLELISM parm. with the backup command that
specifies the number of tblspcs. to be backed up in parallel..


The Backup utility will use the prefetchers if doing so is appropriate
for the tablespace in question. (The PARALLELISM parameter will determine
how many db2bm processes run simultaneously, and the actual I/Os to the
database will be done either by db2bm or the prefetchers.) We have seen
cases where users would increase the backup parallelism to a large
number and have their backup performance go down because they were starved
for prefetchers, fwiw.
Default is one by one, starting with tblspaceid 0.


This will change in v81FP4. When PARALLELISM > 1, tablespaces will be
backed up in order of largest to smallest.

dave

Nov 12 '05 #5

This discussion thread is closed

Replies have been disabled for this discussion.

Similar topics

2 posts views Thread by nt | last post: by
6 posts views Thread by Eric Herber | last post: by
9 posts views Thread by Stanley Sinclair | last post: by
6 posts views Thread by Uthuras | last post: by
4 posts views Thread by Hardy | last post: by
4 posts views Thread by uthuras | last post: by
2 posts views Thread by rdemyan via AccessMonster.com | last post: by
By using this site, you agree to our Privacy Policy and Terms of Use.