ATA disks and RAID controllers for database servers

Dear all,

Here is the first installment concerning ATA disks and RAID controller
use in a database server. I happened to have a Solaris system to myself
this week, so took the opportunity to use it as a "control".

In this post I used the ATA RAID controller merely to enable UDMA 133
on an oldish x86 machine; the effect of any actual RAID level will
(hopefully) be examined subsequently.

So what I was attempting to examine here was: is it feasible to build a
reasonably well performing database server using ATA disks? (In
particular, would disabling the ATA write cache spoil performance
completely?)

The Systems
-----------

Dell 410
2x700MHz PIII 512MB
Promise Fastrack TX2000 Controller
2x40G 7200RPM ATA-133 Maxtor Diamond +8 configured as JBOD
FreeBSD 4.8 (options SMP APIC_IO i686)
PostgreSQL 7.4beta2 (-O2 -funroll-loops -fexpensive-optimizations -march=i686)
ATA write caching controlled via the loader.conf variable hw.ata.wc (1 = on)

Sun 280R
1x900MHz USparc III 1024MB
1x36G 10000RPM FCAL Sun (actually Seagate)
Solaris 8 (recommended patches)
PostgreSQL 7.4beta2 (-O2 -funroll-loops -fexpensive-optimizations)

The Tests
---------

1. Sequential and random writes and reads of a file twice the size of memory

Files were read and written using the read(2) and write(2) functions,
with 8K buffers. For the random case, 10% of the file was sampled using
lseek(2) and then read or written.
(see http://techdocs.postgresql.org/marki...ork-1.0.tar.gz)

2. PostgreSQL pgbench benchmark program

This was run using the options :
-t 1000 [ 1000 transactions ]
-s 10 [ scale factor 10 ]
-c 1,2,4,8,16 [ 1-16 clients ]

Non-default postgresql.conf settings were:
shared_buffers = 5000
wal_buffers = 100
checkpoint_segments = 10

A checkpoint was forced after each run to prevent cross-run interference.
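
The file test (Test 1) can be sketched roughly as follows. This is a minimal Python illustration, not the benchmark program linked above: the function names are hypothetical, and including the final fsync in the write timing is an assumption. It shows a sequential write in 8K blocks and a random read that samples 10% of the blocks via lseek.

```python
import os
import random
import tempfile
import time

BLOCK = 8 * 1024  # 8K buffer, as described in the post


def write_sequential(path, size):
    """Write `size` bytes sequentially in 8K blocks; return MB/s."""
    buf = b"\0" * BLOCK
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    start = time.perf_counter()
    try:
        for _ in range(size // BLOCK):
            os.write(fd, buf)
        os.fsync(fd)  # assumption: the flush is counted in the timing
    finally:
        os.close(fd)
    elapsed = max(time.perf_counter() - start, 1e-9)
    return size / elapsed / 1e6


def read_random(path, fraction=0.10, seed=0):
    """Seek to and read a random `fraction` of the 8K blocks; return MB/s."""
    nblocks = os.path.getsize(path) // BLOCK
    sample = max(1, int(nblocks * fraction))
    rng = random.Random(seed)
    fd = os.open(path, os.O_RDONLY)
    start = time.perf_counter()
    try:
        for _ in range(sample):
            os.lseek(fd, rng.randrange(nblocks) * BLOCK, os.SEEK_SET)
            os.read(fd, BLOCK)
    finally:
        os.close(fd)
    elapsed = max(time.perf_counter() - start, 1e-9)
    return sample * BLOCK / elapsed / 1e6
```

For a meaningful result the file needs to be about twice the size of memory, as in the test above; otherwise the numbers mostly measure the OS buffer cache.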

Results
-------

Test 1

System  IO Operation   Throughput(M/s)  Options
-------------------------------------------------
Sun     seq write      21
        seq read       48
        random write   2.8
        random read    2.2

Dell    seq write      11               hw.ata.wc=0
        seq read       50               hw.ata.wc=0
        random write   1.27             hw.ata.wc=0
        random read    4.2              hw.ata.wc=0

Dell    seq write      20               hw.ata.wc=1
        seq read       53               hw.ata.wc=1
        random write   1.69             hw.ata.wc=1
        random read    4.1              hw.ata.wc=1

Test 2

System  Clients  Throughput(tps)  Options
-------------------------------------------
Sun     1        18
        2        18
        4        22
        8        23
        16       28

Dell    1        27               hw.ata.wc=0
        2        38               hw.ata.wc=0
        4        55               hw.ata.wc=0
        8        58               hw.ata.wc=0
        16       66               hw.ata.wc=0

Dell    1        82               hw.ata.wc=1
        2        137              hw.ata.wc=1
        4        166              hw.ata.wc=1
        8        128              hw.ata.wc=1
        16       117              hw.ata.wc=1

Conclusions
-----------

Test 1

As far as sequential reading goes, there is not much to pick and choose
between ATA and SCSI.

ATA with write caching off does only about half as well as SCSI for
sequential writes. It also fares poorly at random writes, even with
write caching on.

The random read result was surprising - I was expecting SCSI to perform
better on all random operations (seek time on the SCSI drive is about
1/2 that of the ATA). Suspecting that my program was measuring wrong, I
ran similar tests with Bonnie - it finds the ATA drive can do 4 *times*
more seeks/s - hmmm (Bonnie gets the same sequential throughput numbers
too).

A point to note for *both* systems is that all disks were new, so have
not yet 'burned in' - I don't know how significant this might be (anyone?).

Test 2

Hmmm, 3 year old Dell 410 hammers this year's Sun 280R (write caching on
or off). Now it is well known that Solaris is not the fastest platform
for Pg, so maybe let's contain the excitement here. I did experiment
with using bsdmalloc to improve Solaris memory performance - without a
significant improvement (any other ideas?).

But it seems safe to conclude that it's possible to construct a
reasonably well performing ATA based system - even if write caching is off.
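
To put numbers on the Test 2 comparison, the ratios can be worked out directly from the pgbench table above. A small Python sketch (the dictionary and function names are just for illustration):

```python
# tps by client count, copied from the Test 2 table
sun = {1: 18, 2: 18, 4: 22, 8: 23, 16: 28}
dell_wc0 = {1: 27, 2: 38, 4: 55, 8: 58, 16: 66}
dell_wc1 = {1: 82, 2: 137, 4: 166, 8: 128, 16: 117}


def ratios(a, b):
    """Per-client-count ratio a/b, rounded to two decimals."""
    return {clients: round(a[clients] / b[clients], 2) for clients in a}


dell_vs_sun = ratios(dell_wc0, sun)        # Dell (cache off) relative to Sun
cache_effect = ratios(dell_wc1, dell_wc0)  # cache on relative to cache off

for clients in sun:
    print(clients, dell_vs_sun[clients], cache_effect[clients])
```

With caching off the Dell comes out at roughly 1.5 to 2.5 times the Sun's throughput, and turning the write cache on gives a further factor of roughly 1.8 to 3.6 depending on the client count.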

Criticisms
----------

Using "-s 10" only produces a database of 160M - this is cacheable when
you have 512/1024M real memory, so maybe "-s 100" would defeat the
cache. I am currently running some tests with this configuration.
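
The cache argument is simple arithmetic. A sketch, assuming the database size grows linearly with the pgbench scale factor (an assumption; only the 160M figure for "-s 10" comes from the post):

```python
MB_PER_SCALE_UNIT = 160 / 10  # from the post: "-s 10" gives about 160M


def estimated_db_mb(scale):
    """Estimated database size in MB, assuming linear growth with scale."""
    return scale * MB_PER_SCALE_UNIT


def fits_in_ram(scale, ram_mb):
    """True if the whole database could in principle be cached in RAM."""
    return estimated_db_mb(scale) <= ram_mb


for scale in (10, 100):
    for ram_mb in (512, 1024):
        print(scale, ram_mb, estimated_db_mb(scale), fits_in_ram(scale, ram_mb))
```

Under that assumption "-s 100" gives about 1600M, which exceeds the memory of both machines.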

Comparing a dual-processor Intel machine to a single-processor Sun is
not fair - well, a 900MHz UltraSparc III is *supposed* to correspond to
a 1.4GHz Intel, so 2x700MHz PIIIs should be a fair match. However it
does look like the two PIIIs hammer it a bit...

regards

Mark
---------------------------(end of broadcast)---------------------------
TIP 6: Have you searched our list archives?

http://archives.postgresql.org

Nov 12 '05 #1
Dear all,

Here is the second installment concerning ATA disks and RAID controller
use in a database server.

In this post a 2 disk RAID0 configuration is tested, and the results
compared to the JBOD configuration in the previous message.

So again, what I was attempting to examine here was: is it feasible to
build a reasonably well performing database server using ATA disks? (In
particular, would disabling the ATA write cache spoil performance
completely?)

The System
----------

Dell 410
2x700MHz PIII 512MB
Promise Fastrack TX2000 Controller
2x40G 7200RPM ATA-133 Maxtor Diamond +8 configured as JBOD or as RAID0
FreeBSD 4.8 (options SMP APIC_IO i686)
PostgreSQL 7.4beta2 (-O2 -funroll-loops -fexpensive-optimizations -march=i686)
ATA write caching controlled via the loader.conf variable hw.ata.wc (1 = on)

The Tests
---------

1. Sequential and random writes and reads of a file twice the size of memory

Files were read and written using the read(2) and write(2) functions,
with 8K buffers. For the random case, 10% of the file was sampled using
lseek(2) and then read or written.
(see http://techdocs.postgresql.org/marki...ork-1.0.tar.gz)

The filesystem was built with newfs options :
-U -b 32768 -f 4096 [ softupdates, 32K blocks, 4K fragments ]

The RAID0 strip size was 128K. This gave the best performance of the
sizes tried (32K and 64K were also tested; I got tired of rebuilding the
system at that point, so 256K and above may be better).

2. PostgreSQL pgbench benchmark program

This was run using the options :
-t 1000 [ 1000 transactions ]
-s 10 [ scale factor 10 ]
-c 1,2,4,8,16 [ 1-16 clients ]

Non-default postgresql.conf settings were:
shared_buffers = 5000
wal_buffers = 100
checkpoint_segments = 10

A checkpoint was forced after each run to prevent cross-run
interference. Three runs were performed for each configuration, and the
results averaged. A new database was created for each 1-16 client "set"
of runs.
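
The pgbench procedure above could be scripted along these lines. This is only a sketch that builds the command lines for one 1-16 client set; it does not run them. The database name "bench" is hypothetical, and applying "-s 10" at initialisation time (pgbench -i) is an assumption about how the scale factor was used.

```python
CLIENTS = [1, 2, 4, 8, 16]
RUNS = 3  # three runs per client count, results averaged


def pgbench_plan(dbname="bench", scale=10, transactions=1000):
    """Return the command lines for one 1-16 client set of runs."""
    plan = [
        ["createdb", dbname],                         # new database per set
        ["pgbench", "-i", "-s", str(scale), dbname],  # initialise the tables
    ]
    for clients in CLIENTS:
        for _ in range(RUNS):
            plan.append(
                ["pgbench", "-c", str(clients), "-t", str(transactions), dbname]
            )
            # force a checkpoint so one run does not interfere with the next
            plan.append(["psql", "-c", "CHECKPOINT", dbname])
    return plan


def average_tps(samples):
    """Average the tps figures reported by repeated runs."""
    return sum(samples) / len(samples)
```

Each entry of the plan can be handed to subprocess.run(cmd, check=True); the tps figure then has to be picked out of the pgbench output by hand or with a small parser.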

Results
-------

Test 1

System  IO Operation   Throughput(M/s)  Options
-------------------------------------------------
Dell
JBOD    seq write      11               hw.ata.wc=0
        seq read       50               hw.ata.wc=0
        random write   1.3              hw.ata.wc=0
        random read    4.2              hw.ata.wc=0

        seq write      20               hw.ata.wc=1
        seq read       53               hw.ata.wc=1
        random write   1.7              hw.ata.wc=1
        random read    4.1              hw.ata.wc=1

RAID0   seq write      13               hw.ata.wc=0
        seq read       100              hw.ata.wc=0
        random write   1.7              hw.ata.wc=0
        random read    4.2              hw.ata.wc=0

        seq write      38               hw.ata.wc=1
        seq read       100              hw.ata.wc=1
        random write   2.5              hw.ata.wc=1
        random read    4.3              hw.ata.wc=1

Test 2

System  Clients  Throughput(tps)  Options
-------------------------------------------
Dell
JBOD    1        27               hw.ata.wc=0
        2        38               hw.ata.wc=0
        4        55               hw.ata.wc=0
        8        58               hw.ata.wc=0
        16       66               hw.ata.wc=0

        1        82               hw.ata.wc=1
        2        137              hw.ata.wc=1
        4        166              hw.ata.wc=1
        8        128              hw.ata.wc=1
        16       117              hw.ata.wc=1

RAID0   1        33               hw.ata.wc=0
        2        39               hw.ata.wc=0
        4        61               hw.ata.wc=0
        8        73               hw.ata.wc=0
        16       80               hw.ata.wc=0

        1        95               hw.ata.wc=1
        2        156              hw.ata.wc=1
        4        194              hw.ata.wc=1
        8        179              hw.ata.wc=1
        16       144              hw.ata.wc=1

Conclusions
-----------

Test 1

It is clear that, with write caching on, the RAID0 configuration greatly
improves sequential read and write performance - almost twice as fast as
the JBOD case (38 vs 20 M/s for writes, 100 vs 53 M/s for reads). The
random write performance is improved by a reasonable factor too (2.5 vs
1.7 M/s).
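
The near doubling of sequential throughput is what one would expect from striping, since consecutive strips land on alternating disks. A minimal Python sketch, assuming plain round-robin striping with the 128K strip size used here (how the Promise controller actually lays out data is not known from this thread, and the function names are hypothetical):

```python
STRIP = 128 * 1024  # strip size used in the test
DISKS = 2


def locate(offset, strip=STRIP, disks=DISKS):
    """Map a logical byte offset to (disk index, byte offset on that disk)."""
    strip_no = offset // strip
    disk = strip_no % disks
    disk_offset = (strip_no // disks) * strip + offset % strip
    return disk, disk_offset


def bytes_per_disk(start, length, strip=STRIP, disks=DISKS):
    """How many bytes of a logical extent fall on each disk."""
    totals = [0] * disks
    pos, end = start, start + length
    while pos < end:
        disk, _ = locate(pos, strip, disks)
        chunk = min(end - pos, strip - pos % strip)  # stay inside one strip
        totals[disk] += chunk
        pos += chunk
    return totals
```

A large sequential transfer is split evenly across both disks, so both can stream at once. A single 8K random access falls entirely on one disk, which fits the random read figures barely changing between JBOD and RAID0.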

With write caching disabled, the write rates are similar to the JBOD
case. This *may* indicate some design issue in the Promise controller.

Test 2

For write caching on or off, the RAID0 configuration is faster - by
about 18 percent.
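
The size of the improvement can be checked against the Test 2 table. A short Python sketch that averages the per-client-count gain of RAID0 over JBOD (averaging the per-row ratios is one choice among several, so the exact figure depends on the method):

```python
# tps for 1, 2, 4, 8, 16 clients, keyed by hw.ata.wc setting
jbod = {0: [27, 38, 55, 58, 66], 1: [82, 137, 166, 128, 117]}
raid0 = {0: [33, 39, 61, 73, 80], 1: [95, 156, 194, 179, 144]}


def mean_gain(base, new):
    """Mean fractional improvement of `new` over `base`, row by row."""
    gains = [n / b - 1 for b, n in zip(base, new)]
    return sum(gains) / len(gains)


for wc in (0, 1):
    print(f"hw.ata.wc={wc}: {mean_gain(jbod[wc], raid0[wc]):.1%}")
```

Computed this way the gain is in the mid-teens (percent) with caching off and a little over twenty percent with caching on, which brackets the figure quoted above.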

General

Clearly it is possible to obtain very good performance with write
caching on using RAID0, and if you have a UPS together with good backup
practice then this could be the way to go.

With caching off there is a considerable decrease in performance;
however, this performance may be "good enough" if viewed in a
cost-benefit-safety manner.

Criticisms
----------

It would have been good to have two SCSI disks to test in the Dell
machine (as opposed to using a Sun 280R); unfortunately I can't justify
the cost of them for this test :-(. However there are some examples of
similar comparisons in the PostgreSQL General thread "Recomended FS"
(without an ATA RAID controller).

Mark



Nov 12 '05 #2
On Sat, 15 Nov 2003 14:07:40 +1300
Mark Kirkwood <ma****@paradise.net.nz> wrote:

Clearly it is possible to obtain very good performance with write
caching on using RAID0, and if you have a UPS together with good backup
practice then this could be the way to go.

With caching off there is a considerable decrease in performance;
however, this performance may be "good enough" if viewed in a
cost-benefit-safety manner.


Unless the controller itself has a battery-backed cache, it is dangerous - there are many more failure modes than losing power, e.g. blowing out the power supply or CPU. We've burnt up a fair share of CPUs over the years. Luckily on a Sun it isn't that big a deal... but on x86, well... you get the idea.

--
Jeff Trout <je**@jefftrout.com>
http://www.jefftrout.com/
http://www.stuarthamm.net/


Nov 12 '05 #3
Clinging to sanity, th******@torgo.978.org (Jeff) mumbled into her beard:
On Sat, 15 Nov 2003 14:07:40 +1300
Mark Kirkwood <ma****@paradise.net.nz> wrote:

Clearly it is possible to obtain very good performance with write
caching on using RAID0, and if you have a UPS together with good backup
practice then this could be the way to go.

With caching off there is a considerable decrease in performance;
however, this performance may be "good enough" if viewed in a
cost-benefit-safety manner.


Unless the controller itself has a battery-backed cache, it is
dangerous - there are many more failure modes than losing power, e.g.
blowing out the power supply or CPU. We've burnt up a fair share of
CPUs over the years. Luckily on a Sun it isn't that big a
deal... but on x86, well... you get the idea.


Furthermore, if the disk drives are lying to the controller, it's
anybody's guess whether or not data ever actually gets to the disk.

When is it safe to let blocks expire out of the controller cache?

If your computer can't know if the data has been written (because of
drives that lie), I can't imagine how the controller would (since the
drives are lying to the controller, too).
--
If this was helpful, <http://svcs.affero.net/rm.php?r=cbbrowne> rate me
http://www3.sympatico.ca/cbbrowne/
"The primary difference between computer salesmen and used car
salesmen is that used car salesmen know when they're lying to you."
Nov 12 '05 #4
Furthermore, if the disk drives are lying to the controller, it's
anybody's guess whether or not data ever actually gets to the disk.

When is it safe to let blocks expire out of the controller cache?

If your computer can't know if the data has been written (because of
drives that lie), I can't imagine how the controller would (since the
drives are lying to the controller, too).


As I understand it, there is only one lie: the actual write to the disk.
The receipt into the drive *cache* is not lied about - hence the
discussion on mlist.linux.kernel about capacitors to allow enough power
for a cache flush in a power-off situation.
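
One rough way to see whether a drive acknowledges writes from its cache is to time small synchronous writes. A 7200RPM disk turns 120 times a second, so a rate far above that for single-block fsync'd writes to the same file suggests the acknowledgement came from a cache and not the platter. This is a common rule of thumb, not something measured in this thread, and the function name below is hypothetical.

```python
import os
import tempfile
import time


def synced_writes_per_second(path, count=50, block=8192):
    """Write `count` blocks, calling fsync after each; return writes/s."""
    buf = b"\0" * block
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    start = time.perf_counter()
    try:
        for _ in range(count):
            os.write(fd, buf)
            os.fsync(fd)  # returns once the OS reports the data as written
    finally:
        os.close(fd)
    elapsed = max(time.perf_counter() - start, 1e-9)
    return count / elapsed


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        rate = synced_writes_per_second(os.path.join(tmp, "sync.dat"))
        print(f"{rate:.0f} synced writes/s (7200RPM is 120 rotations/s)")
```

The file has to live on the disk under test, not on a memory-backed temporary filesystem, for the number to mean anything.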

regards

Mark



Nov 12 '05 #5
Jeff wrote:

Unless the controller itself has a battery-backed cache, it is dangerous - there are many more failure modes than losing power, e.g. blowing out the power supply or CPU. We've burnt up a fair share of CPUs over the years. Luckily on a Sun it isn't that big a deal... but on x86, well... you get the idea.

Agreed. Power supply failure seems to be an ever present menace - had
one last month on a Sun E220, 3 years old - it's saying "replace me" :-)

regards

Mark

Nov 12 '05 #6
