Bytes IT Community

Logging speed.

When initially populating a database, it runs slower than I would have
supposed. Checking here and there, especially with iostat, reveals that
most of the time is spent writing the logfiles. In other words, I am not
compute limited, but I am not IO limited either (in terms of bytes/second
to the drives). The system is in IO-WAIT state most of the time, so I
assume the process is really seek limited.

It is true that my method of populating the database leaves a lot to be
desired. The input is many hundreds of spreadsheet pages with hundreds of
columns and several thousand rows each. I do not want relations with
hundreds of columns, so to normalize them, I read a row at a time and do a
lot of inserts of subsets of the columns into different relations.

When all is done, everything is fine.

The large tables in the database, and the indices are DMS on 10,000rpm
Ultra/320 LVD SCSI hard drives. The logfiles go on a 7,200rpm 100MHz EIDE
hard drive with 8 megabyte cache in the drive. Possibly putting that onto
one of the SCSI drives might help slightly, but I doubt it because it
would have to seek from the data partition to the log partition and that
would probably lose me any savings from the increased rotational speed.
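
As a back-of-envelope check on that intuition: average rotational latency is
half a revolution, i.e. 60000 / rpm / 2 milliseconds. The helper below is our
own arithmetic (not from the post) comparing the two spindle speeds:

```shell
# Average rotational latency = half a revolution: 60000 / rpm / 2, in ms.
rot_latency_ms() { awk -v rpm="$1" 'BEGIN { printf "%.2f", 60000 / rpm / 2 }'; }

echo "10000 rpm: $(rot_latency_ms 10000) ms"   # 3.00 ms
echo " 7200 rpm: $(rot_latency_ms 7200) ms"    # 4.17 ms
```

So the faster spindle saves only about 1.2 ms per rotation on average; for a
seek-bound workload the savings could indeed be eaten by extra seeks between
a data partition and a log partition on the same drive.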

Any suggestions? Fortunately, this is not an urgent problem, but it would
be nice to understand DB2 well enough to assure myself that this is about
as fast as I should want.

--
.~. Jean-David Beyer Registered Linux User 85642.
/V\ Registered Machine 241939.
/( )\ Shrewsbury, New Jersey http://counter.li.org
^^-^^ 16:10:00 up 1 day, 7:45, 3 users, load average: 1.32, 1.14, 1.04

Nov 12 '05 #1
9 Replies


In article <10*************@corp.supernews.com>, Jean-David Beyer
(jd*****@exit109.com) says...
Any suggestions? Fortunately, this is not an urgent problem, but it would
be nice to understand DB2 well enough to assure myself that this is about
as fast as I should want.


You might need to change the number of log pages in the log buffer. I
tested it some months ago just to find out how fast inserts would go
on my machine and it was about 5 times faster after changing it.
With 'db2 update db cfg for <dbname> using LOGBUFSZ <newvalue>' you
can change it, on WXP the default is 8.
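
For reference, the full CLP sequence might look like the sketch below. MYDB
and the value 256 are placeholders, not from the post; LOGBUFSZ is counted in
4 KB pages:

```shell
# Check the current log buffer size (units are 4 KB pages).
db2 get db cfg for MYDB | grep -i LOGBUFSZ

# Raise it; 256 pages (1 MB) is only an example value -- tune to taste.
db2 update db cfg for MYDB using LOGBUFSZ 256

# The new value takes effect the next time the database is activated.
db2 terminate
```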
Nov 12 '05 #2

Ian
Jean-David Beyer wrote:
[...]

Any suggestions? Fortunately, this is not an urgent problem, but it
would be nice to understand DB2 well enough to assure myself that this
is about as fast as I should want.


There are lots of things you can do to affect this, such as:

1) Alter your transaction (i.e. autocommit off)
2) LOGBUFSZ (as suggested by Gert)
3) Consider LOGFILSIZ and LOGPRIMARY
4) Depending on your bufferpools, having the logs on the faster disk
may not be as much of a penalty as you think. Hopefully the data
and index writes are asynchronous.
5) Consider writing to flat files and then using LOAD.
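
Item 1 (turning autocommit off so many inserts share one transaction) can be
sketched with the CLP's autocommit switch. MYDB and the .sql file names below
are hypothetical:

```shell
# +c turns CLP auto-commit off, so each script file runs as one transaction.
# End each file with an explicit COMMIT statement.
db2 connect to MYDB
db2 +c -tvf accounts.sql
db2 +c -tvf holdings.sql
db2 connect reset
```

Batching commits this way cuts the number of synchronous log flushes from one
per insert to one per batch.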
Good luck
-----= Posted via Newsfeeds.Com, Uncensored Usenet News =-----
http://www.newsfeeds.com - The #1 Newsgroup Service in the World!
-----== Over 100,000 Newsgroups - 19 Different Servers! =-----
Nov 12 '05 #3

For an initial populate from an external source, you could try making
the tables NOT LOGGED INITIALLY. This will eliminate all logging.

You'll have to back up the tablespaces after the load and before allowing
updates. It's unlikely that you would need the logs, because an early
failure would probably require a data reload from the external sources
anyway. This requires setting the NOT LOGGED attribute for all of the
tables and loading all of them in a single unit of work (no commits).
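
A minimal sketch of that sequence in the CLP (MYDB, the table name, and the
tablespace name are hypothetical; +c keeps autocommit off so the ALTER and
the inserts fall in the same unit of work, which NOT LOGGED INITIALLY
requires):

```shell
db2 connect to MYDB

# Activate NOT LOGGED INITIALLY and do the population in ONE transaction;
# the attribute deactivates as soon as the unit of work commits.
db2 +c "ALTER TABLE accounts ACTIVATE NOT LOGGED INITIALLY"
db2 +c -tvf populate_accounts.sql
db2 commit

# Back up the affected tablespace before allowing further updates.
db2 backup db MYDB tablespace (TS_DATA) online

db2 connect reset
```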

Phil Sherman
Jean-David Beyer wrote:
[...]


Nov 12 '05 #4

Gert van der Kooij wrote:
In article <10*************@corp.supernews.com>, Jean-David Beyer
(jd*****@exit109.com) says...

Any suggestions? Fortunately, this is not an urgent problem, but it would
be nice to understand DB2 well enough to assure myself that this is about
as fast as I should want.


You might need to change the number of log pages in the log buffer. I
tested it some months ago just to find out how fast inserts would go
on my machine and it was about 5 times faster after changing it.
With 'db2 update db cfg for <dbname> using LOGBUFSZ <newvalue>' you
can change it, on WXP the default is 8.


On RHEL 3 ES, the default is also 8. I raised it to 32, dropped and
re-created the database, and started populating it. It seems about 12%
faster: chunks of work that used to take slightly over 8 seconds now take
about 6.6 seconds. Now the bottleneck seems to be the index, which is on a
hard drive all its own. It might pay to split the index onto two hard
drives, but I am not sure how. I already have 6 hard drives on this
machine, and I am concerned about seek time, since I am by no means
saturating the IO channels (Ultra/320 LVD SCSI controller with just the
four SCSI drives on it), and I was writing around 8 MBytes/sec most of
the time. If I do a reorg of a database that really needs it, I can see
over 33 MBytes/sec for extended periods and 50 MBytes/sec peaks.

Here is the output of iostat. I removed the reads, since it was not doing
any. And I removed some other redundant data as well to shorten the lines.

Device: w/s wkB/s avgrq-sz avgqu-sz await svctm %util
/dev/hda10 1733.24 6954.02 8.02 3.19 0.19 0.18 31.86
/dev/hdb1 0.00 0.00 0.00 0.00 0.00 0.00 0.00
/dev/sda1 26.59 106.37 8.00 0.03 0.10 0.10 0.28
/dev/sdb1 9.42 37.67 8.00 0.03 0.29 0.29 0.28
/dev/sdc1 28.53 114.13 8.00 0.17 0.58 0.58 1.66
/dev/sdd2 590.86 2364.54 8.00 96.57 16.26 2.34 138.50

hda10 has the logfiles on it.
hdb1 has unimportant small (under 10K rows) tables and indices on it.
sda1, sdb1, sdc1 have the (important: large, busy) data on them.
sdd2 has the (important: large, busy) indices on it.

--
.~. Jean-David Beyer Registered Linux User 85642.
/V\ Registered Machine 241939.
/( )\ Shrewsbury, New Jersey http://counter.li.org
^^-^^ 18:20:00 up 1 day, 9:55, 5 users, load average: 10.25, 10.12, 9.23

Nov 12 '05 #5

Ian wrote:
Jean-David Beyer wrote:
[...]

There are lots of things you can do to affect this, such as:

1) Alter your transaction (i.e. autocommit off)
2) LOGBUFSZ (as suggested by Gert)


Tried this: raise LOGBUFSZ from default (8) to 32 and got about 12%
improvement.

3) Consider LOGFILSIZ and LOGPRIMARY

I have not set LOGFILSIZ, and I guess the default is 4 MBytes.
LOGPRIMARY is 4 right now.

4) Depending on your bufferpools, having the logs on the faster disk
may not be as much of a penalty as you think. Hopefully the data
and index writes are asynchronous.

This machine has 4096 Meg RAM and I could double it if someone sent me
$1200 (I do not expect anyone to do that), so I can have fairly large
buffer pools. I wondered about making the buffer pool for the indices
large enough to hold the entire index so after a startup phase, it would
need to do no IO for SELECTs. But in populating the thing, it seems to
want to keep the disks up to date (makes sense, after all), so it writes
all the time even though it might be able to afford to wait. But when I do
a COMMIT, it does not care to delay very much, and rightly so.

Also, how much faster are the 10,000rpm SCSI disks than the 7,200rpm EIDE
drives? Some, I would suppose. If I had the money, I might put another
Ultra/320 10,000rpm hard drive in there on the other SCSI controller, but
for a hobby database, six hard drives is already a bit ridiculous. I have
fiddled with the extent size (in range of 32 to 64 pages), and stuff like
that, but it does not seem to make much difference.

One thing I notice is that I have set it to run one more page cleaner than
I have hard drives; i.e., 7 page cleaners. When populating the database,
all of them get into D state in Linux (basically waiting for IO). It does
not seem that increasing this number would help much, since more of them
would just contend for the same drives. Is this right?
5) Consider writing to flat files and then using LOAD.

I am afraid that may be the best way but, since the system is now running
about 12x faster (from a user's standpoint) because of a much faster
machine, using DMS instead of SMS storage for most tables, using 4 hard
drives that are not being fought over by other things, and so on, it is
actually fast enough. I just want to find easy ways to tune it. If this
were a serious problem, if I had to do it often, or if I had a paying
customer, I would probably go the route you suggest of writing flat files
that look like the relations I want and use LOAD. But none of these
conditions obtain and I do not want to bother, at least for now.
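
For whenever that route does become worthwhile, the LOAD step might look
roughly like this (accounts.del and MYDB are hypothetical names;
NONRECOVERABLE skips the post-load backup requirement at the cost of
forward recoverability for that table):

```shell
# Write the normalized rows out as a comma-delimited (DEL) file first,
# then bulk-load it; LOAD bypasses most per-row logging.
db2 connect to MYDB
db2 "LOAD FROM accounts.del OF DEL REPLACE INTO accounts NONRECOVERABLE"
db2 connect reset
```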
Good luck

--
.~. Jean-David Beyer Registered Linux User 85642.
/V\ Registered Machine 241939.
/( )\ Shrewsbury, New Jersey http://counter.li.org
^^-^^ 18:45:00 up 1 day, 10:20, 5 users, load average: 1.02, 1.31, 3.73

Nov 12 '05 #6

Ensure you're using a SCSI card with an adequate battery-backed write cache.
Put the logs on SCSI disks behind the caching controller, along with
everything else. Remove all of the IDE drives from the system. If you still
have problems, upgrade the 10K drives that cause the delays to 15K drives,
and/or split the logs over multiple drives.

"Jean-David Beyer" <jd*****@exit109.com> wrote in message
news:10*************@corp.supernews.com...
[...]

Nov 12 '05 #7

Mark Yudkin wrote:
Ensure you're using a SCSI card with an adequate battery-backed write cache.
Put the logs on SCSI disks on the caching drive, along with everything else.
Remove all of the IDE drives from the system. If you still have problems,
upgrade those 10K drives that cause the delays to 15K drives, and/or split
logs over multiple drives.


... or better yet, use a battery-backed RAM disk as the log device ...

Jan M. Nelken
Nov 12 '05 #8

While we're there, check your SCSI driver parms.

stuff like :
maxtags
maximumSGlist
numberofrequests

PM
Nov 12 '05 #9

He isn't using SCSI - he's using 7200 rpm EIDE for his log files. As a
result he's spending all his time waiting for disk rotation to occur.

"PM (pm3iinc-nospam) CGO" <PM (pm3iinc-nospam)@cgocable.ca> wrote in message
news:5i****************@charlie.risq.qc.ca...
While we're there, check your scsi driver parms.

stuff like :
maxtags
maximumSGlist
numberofrequests

PM

Nov 12 '05 #10
