Bytes | Software Development & Data Engineering Community

DB2 vs MySQL - performance on large tables

Hi all,

I am running a database containing large datasets:

frames: 20 thousand rows,
coordinates: 170 million rows.

The database has been implemented with:

IBM DB2 v8.1
MySQL v3.23.54

using similar DDL and the same indexes. I have run a SQL query joining the two tables:

select c.x,c.y,c.z from coordinate as c,frame as f
where fid<1000 and f.tid=1 and f.id=c.fid

The query operates on 3400 billion possible join pairs and returns about 8.5 million records. MySQL seems to outperform DB2 by a wide margin:

DB2: 7 minutes
MySQL: 3 minutes

MySQL is running with default settings, while DB2 has:

Buffer pool: 500 MB (1GB system memory)
For more settings, see my previous message titled
"Improve query performance on large table"

Should I assume MySQL does run much faster than DB2 on large tables? Could anyone advise? BTW, one thing I should also mention is the returned data formats:

DB2: 1.279200e+01 -1.596900e+01 -3.979500e+01
MySQL: 12.792 -15.969 -39.795

(x,y,z are defined as REAL in both databases)

I doubt this could account for the performance difference.
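The notation difference is just client-side formatting of the same REAL values, not a difference in the stored data. A quick Python sketch (illustrative only; this is not the actual DB2 or MySQL client code):

```python
x = 12.792  # one of the REAL values above

db2_style = "%e" % x    # scientific notation, like the DB2 CLP output
mysql_style = "%g" % x  # compact notation, like the MySQL client output

print(db2_style)    # 1.279200e+01
print(mysql_style)  # 12.792
```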
Best regards,

Bing


Nov 12 '05 #1
No, you should not assume that SQL Server outperforms DB2 on large
tables. Have you run the configuration advisor from the control center
to get suggested values on config parameters? What about the db2advis
command?

If you run the query several times so the buffer pool gets filled with
data does performance improve on either or both systems? Have you
updated statistics (runstats) on the tables, distribution of values, and
indexes?

Bing Wu wrote:
> <original question snipped>


Nov 12 '05 #2
"Blair Adamache" <ba*******@2muchspam.yahoo.com> wrote in message
news:bo**********@hanover.torolab.ibm.com...
> <quoted text snipped>

He is talking about MySQL, not MS SQL Server.
Nov 12 '05 #3
Sorry. MySQL has less functionality than DB2, so it may execute a shorter
code path for some operations, and so may be faster. My advice on DB2
is still worth following. If DB2 is "over-optimizing" you could try the
query with an optimization level somewhere from 1 to 4.

Mark A wrote:
> <quoted text snipped>
>
> He is talking about MySQL, not MS SQL Server.


Nov 12 '05 #4
Ken
Bing Wu <bi**@biop.ox.ac.uk> wrote in message news:<bo**********@news.ox.ac.uk>...
> Should I assume MySQL does run much faster than DB2 on large tables? Could anyone advise?


I wouldn't make that assumption - seems like you've only got a single
data point here.

I'm no expert on mysql - but as far as I'm aware, it lacks much of the
configurability of the big databases. So, with that in mind...

- you've got a dual cpu box there - are you taking advantage of that?
you won't get the parallelism advantages that you could get out of an
8-way, but it's a lot better than nothing. And I don't think mysql
can use it.

- what's your io subsystem look like? just a single disk? ah well,
on the other hand, if you've got the data striped across 4-14 drives,
or can split indexes and various tables across different disks there
are benefits to be gained.

- you're sorting - are you spilling to disk or keeping it all in
memory? It can make a tremendous difference.

And as others have mentioned, running the wizards to get configuration
and performance advice would probably be a great first step...
Nov 12 '05 #5
Thanks to all for the invaluable comments.

Ken wrote:
> I wouldn't make that assumption - seems like you've only got a single
> data point here.
>
> I'm no expert on mysql - but as far as I'm aware, it lacks much of the
> configurability of the big databases. So, with that in mind...
>
> - you've got a dual cpu box there - are you taking advantage of that?
> you won't get the parallelism advantages that you could get out of an
> 8-way, but it's a lot better than nothing. And I don't think mysql
> can use it.

I agree MySQL lacks the high-end features of an enterprise database.

> - what's your io subsystem look like? just a single disk? ah well,
> on the other hand, if you've got the data striped across 4-14 drives,
> or can split indexes and various tables across different disks there
> are benefits to be gained.

It is a single 74GB SCSI disk. The server is a Sun Cobalt F550 with dual CPUs.

> - you're sorting - are you spilling to disk or keeping it all in
> memory? It can make a tremendous difference.
>
> And as others have mentioned, running the wizards to get configuration
> and performance advice would probably be a great first step...


I have run the wizard and re-configured the database, but there still seems to be no change in performance. I wonder if I have missed anything in the tuning, since 7 minutes still leaves a large gap behind the open-source MySQL.

Best regards,

Bing

Nov 12 '05 #6
Bing Wu <bi**@biop.ox.ac.uk> wrote:
> frames: 20 thousand rows,
> coordinates: 170 million rows.
>
> select c.x,c.y,c.z from coordinate as c,frame as f
> where fid<1000 and f.tid=1 and f.id=c.fid
>
> The query operates on 3400 billion possible join pairs and returns
> about 8.5 million records. It seems MySQL outperforms DB2 well ahead:
>
> DB2: 7 minutes
> MySQL: 3 minutes


To maximise i/o efficiency for this query, you could consider
clustering frame on tid, id, and clustering coordinate on fid. Make
sure they are 100% clustered, then re-run RUNSTATS and then the query.
All the required data for the join will now be contiguous.
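To illustrate with a toy model (purely hypothetical page layout in Python, not DB2's actual storage format): when rows are clustered on fid, a range predicate touches a small contiguous run of pages, whereas the same rows in append order are scattered across many pages.

```python
import random

PAGE_SIZE = 4  # toy pages holding 4 rows each

def pages_touched(rows, pred):
    """Page numbers holding at least one row matching the predicate."""
    return {i // PAGE_SIZE for i, row in enumerate(rows) if pred(row)}

rows = [(fid,) for fid in range(100)]        # 100 rows, fid 0..99
scattered = rows[:]
random.seed(0)
random.shuffle(scattered)                    # append-mode-style ordering
clustered = sorted(scattered)                # 100% clustered on fid

match = lambda r: r[0] < 20                  # analogous to fid < 1000

print(len(pages_touched(clustered, match)))  # 5 contiguous pages
print(len(pages_touched(scattered, match)))  # many more, scattered about
```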

However there may be other good reasons not to do this, or you may
have this setup already. If you are saying DB2 takes 7 minutes to
start returning the result set, I think that can be greatly improved
(perhaps to about 1 minute, assuming your rows are as short as they
seem). If OTOH you're saying that DB2 joins and returns to the client
the 8.5m rows in 7 minutes, i.e. 20,000 rows per second, I'd say you
are doing as well as you can expect - most DBMS's are fast at
calculating result sets but slow at returning them (I guess mySQL is
different). It's generally best to do as much work as possible inside
the DBMS to reduce result set size.
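For reference, the rates being discussed work out as follows (a back-of-envelope check in Python):

```python
rows = 8_500_000          # reported result set size
db2_secs = 7 * 60         # reported DB2 elapsed time
mysql_secs = 3 * 60       # reported MySQL elapsed time

print(round(rows / db2_secs))    # roughly 20,000 rows/sec from DB2
print(round(rows / mysql_secs))  # roughly 47,000 rows/sec from MySQL
```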

You could try running multiple streams to speed up the extraction
rate, or check out the DB2 Fast Unload utility (extra cost). If anyone
knows of any parameters you can tweak to speed up bulk unloads to the
client, I'd appreciate that info.
DG
Nov 12 '05 #7
Database Guy wrote:
> To maximise i/o efficiency for this query, you could consider
> clustering frame on tid, id, and clustering coordinate on fid. Make
> sure they are 100% clustered, then re-run RUNSTATS and then the query.
> All the required data for the join will now be contiguous.

Thanks. The table 'coordinate' has been set to append mode, so clustering is not possible. The frame table doesn't really matter at the moment. When I run the query against the 'coordinate' table alone:

select x,y,x from coordinate where fid<1000

it takes more than 10 minutes to finish. I don't really understand why.

> However there may be other good reasons not to do this, or you may
> have this setup already. If you are saying DB2 takes 7 minutes to
> start returning the result set, I think that can be greatly improved
> (perhaps to about 1 minute, assuming your rows are as short as they
> seem). If OTOH you're saying that DB2 joins and returns to the client
> the 8.5m rows in 7 minutes, i.e. 20,000 rows per second, I'd say you
> are doing as well as you can expect - most DBMS's are fast at
> calculating result sets but slow at returning them (I guess mySQL is
> different). It's generally best to do as much work as possible inside
> the DBMS to reduce result set size.

It seems I/O performance somehow lets DB2 down. DB2 starts pumping out the data as soon as I issue the command:

$ db2 "select ..." > out.file

and I can observe that out.file grows at about 700KB/s, while in the case of MySQL the speed is roughly double that.

> You could try running multiple streams to speed up the extraction
> rate, or check out the DB2 Fast Unload utility (extra cost). If anyone
> knows of any parameters you can tweak to speed up bulk unloads to the
> client, I'd appreciate that info.

Thanks for the info. I'll keep trying.

Best wishes,

Bing

Nov 12 '05 #8
Leo
Run runstats on both tables and the associated indexes. Then make
sure you have the explain tables created (the DDL for them is under
~/sqllib/misc/EXPLAIN.DDL, I believe). Then put your query in a file
and do the following:

db2 set current explain mode explain
db2 -tf filename
db2 set current explain mode no
db2exfmt -d database_name -g TIC -1 -o filename.out

Then post the contents of filename.out so we can all see the explain
output. It will make it much easier to solve the problem.
Leo

Bing Wu <bi**@biop.ox.ac.uk> wrote in message news:<bo**********@news.ox.ac.uk>...
> <previous message snipped>

Nov 12 '05 #9
Leo wrote:
> <runstats and explain instructions snipped>


Thanks Leo. I have attached the output below. Please advise.

Bing
DB2 Universal Database Version 8.1, 5622-044 (c) Copyright IBM Corp. 1991, 2002
Licensed Material - Program Property of IBM
IBM DATABASE 2 Explain Table Format Tool

******************** EXPLAIN INSTANCE ********************

DB2_VERSION: 08.01.3
SOURCE_NAME: DB2ADVIS
SOURCE_SCHEMA: NULLID
SOURCE_VERSION:
EXPLAIN_TIME: 2003-11-06-18.11.22.180128
EXPLAIN_REQUESTER: DB2ADMIN

Database Context:
----------------
Parallelism: None
CPU Speed: 6.612820e-07
Comm Speed: 1
Buffer Pool size: 138820
Sort Heap size: 758
Database Heap size: 1200
Lock List size: 100
Maximum Lock List: 60
Average Applications: 1
Locks Available: 6780

Package Context:
---------------
SQL Type: Dynamic
Optimization Level: 5
Blocking: Block Unambiguous Cursors
Isolation Level: Cursor Stability

---------------- STATEMENT 1 SECTION 26 ----------------
QUERYNO: 1
QUERYTAG:
Statement Type: Select
Updatable: No
Deletable: No
Query Degree: 1

Original Statement:
------------------
SELECT C.X,C.Y
FROM COORDINATE AS C, FRAME AS F
WHERE F.TID=1 AND C.FID=F.ID AND F.ID<1000 AND F.ID>0
ORDER BY C.FID,C.AID
Optimized Statement:
-------------------
SELECT Q2.X AS "X", Q2.Y AS "Y", Q2.FID, Q2.AID
FROM DB2ADMIN.FRAME AS Q1, DB2ADMIN.COORDINATE AS Q2
WHERE (0 < Q1.ID) AND (Q1.ID < 1000) AND (Q2.FID = Q1.ID) AND (Q1.TID = 1)
ORDER BY Q2.FID, Q2.AID

Access Plan:
-----------
Total Cost: 11282.7
Query Degree: 1

Rows
RETURN
( 1)
Cost
I/O
|
27342.7
NLJOIN
( 2)
11282.7
448.963
/------+-----\
80.432 339.948
IXSCAN IXSCAN
( 3) ( 4)
25.1445 139.995
1 5.56946
| |
20108 1.7087e+08
INDEX: SYSTEM INDEX: SYSTEM
IDX0311061811220 IDX0311061811220


1) RETURN: (Return Result)
Cumulative Total Cost: 11282.7
Cumulative CPU Cost: 8.86135e+07
Cumulative I/O Cost: 448.963
Cumulative Re-Total Cost: 57.5291
Cumulative Re-CPU Cost: 8.69963e+07
Cumulative Re-I/O Cost: 0
Cumulative First Row Cost: 100.13
Estimated Bufferpool Buffers: 449.963

Arguments:
---------
BLDLEVEL: (Build level)
DB2 v8.1.0.24 : s030728

Input Streams:
-------------
5) From Operator #2

Estimated number of rows: 27342.7
Number of columns: 2
Subquery predicate ID: Not Applicable

Column Names:
------------
+Q3.Y+Q3.X
2) NLJOIN: (Nested Loop Join)
Cumulative Total Cost: 11282.7
Cumulative CPU Cost: 8.86135e+07
Cumulative I/O Cost: 448.963
Cumulative Re-Total Cost: 57.5291
Cumulative Re-CPU Cost: 8.69963e+07
Cumulative Re-I/O Cost: 0
Cumulative First Row Cost: 100.13
Estimated Bufferpool Buffers: 449.963

Arguments:
---------
EARLYOUT: (Early Out flag)
NONE
FETCHMAX: (Override for FETCH MAXPAGES)
IGNORE
ISCANMAX: (Override for ISCAN MAXPAGES)
IGNORE

Predicates:
----------
6) Predicate used in Join
Relational Operator: Equal (=)
Subquery Input Required: No
Filter Factor: 4.97315e-05

Predicate Text:
--------------
(Q2.FID = Q1.ID)
Input Streams:
-------------
2) From Operator #3

Estimated number of rows: 80.432
Number of columns: 1
Subquery predicate ID: Not Applicable

Column Names:
------------
+Q1.ID(A)

4) From Operator #4

Estimated number of rows: 339.948
Number of columns: 4
Subquery predicate ID: Not Applicable

Column Names:
------------
+Q2.FID(A)+Q2.AID(A)+Q2.Y(A)+Q2.X(A)
Output Streams:
--------------
5) To Operator #1

Estimated number of rows: 27342.7
Number of columns: 2
Subquery predicate ID: Not Applicable

Column Names:
------------
+Q3.Y+Q3.X
3) IXSCAN: (Index Scan)
Cumulative Total Cost: 25.1445
Cumulative CPU Cost: 218454
Cumulative I/O Cost: 1
Cumulative Re-Total Cost: 0.114397
Cumulative Re-CPU Cost: 172993
Cumulative Re-I/O Cost: 0
Cumulative First Row Cost: 25.0529
Estimated Bufferpool Buffers: 2

Arguments:
---------
JN INPUT: (Join input leg)
OUTER
MAXPAGES: (Maximum pages for prefetch)
1
PREFETCH: (Type of Prefetch)
NONE
ROWLOCK : (Row Lock intent)
NEXT KEY SHARE
SCANDIR : (Scan Direction)
FORWARD
TABLOCK : (Table Lock intent)
INTENT SHARE

Predicates:
----------
4) Start Key Predicate
Relational Operator: Less Than (<)
Subquery Input Required: No
Filter Factor: 0.333333

Predicate Text:
--------------
(0 < Q1.ID)

5) Stop Key Predicate
Relational Operator: Less Than (<)
Subquery Input Required: No
Filter Factor: 0.333333

Predicate Text:
--------------
(Q1.ID < 1000)

7) Start Key Predicate
Relational Operator: Equal (=)
Subquery Input Required: No
Filter Factor: 0.04

Predicate Text:
--------------
(Q1.TID = 1)

7) Stop Key Predicate
Relational Operator: Equal (=)
Subquery Input Required: No
Filter Factor: 0.04

Predicate Text:
--------------
(Q1.TID = 1)
Input Streams:
-------------
1) From Object SYSTEM.IDX031106181122000

Estimated number of rows: 20108
Number of columns: 3
Subquery predicate ID: Not Applicable

Column Names:
------------
+Q1.ID(A)+Q1.$RID$+Q1.TID
Output Streams:
--------------
2) To Operator #2

Estimated number of rows: 80.432
Number of columns: 1
Subquery predicate ID: Not Applicable

Column Names:
------------
+Q1.ID(A)
4) IXSCAN: (Index Scan)
Cumulative Total Cost: 139.995
Cumulative CPU Cost: 1.14706e+06
Cumulative I/O Cost: 5.56946
Cumulative Re-Total Cost: 89.9584
Cumulative Re-CPU Cost: 1.0916e+06
Cumulative Re-I/O Cost: 3.56946
Cumulative First Row Cost: 75.0767
Estimated Bufferpool Buffers: 71777.7

Arguments:
---------
JN INPUT: (Join input leg)
INNER
MAXPAGES: (Maximum pages for prefetch)
3
PREFETCH: (Type of Prefetch)
NONE
ROWLOCK : (Row Lock intent)
NEXT KEY SHARE
SCANDIR : (Scan Direction)
FORWARD
TABLOCK : (Table Lock intent)
INTENT SHARE

Predicates:
----------
2) Sargable Predicate
Relational Operator: Less Than (<)
Subquery Input Required: No
Filter Factor: 0.0400051

Predicate Text:
--------------
(Q2.FID < 1000)

2) Stop Key Predicate
Relational Operator: Less Than (<)
Subquery Input Required: No
Filter Factor: 0.0400051

Predicate Text:
--------------
(Q2.FID < 1000)

3) Sargable Predicate
Relational Operator: Less Than (<)
Subquery Input Required: No
Filter Factor: 1

Predicate Text:
--------------
(0 < Q2.FID)

3) Start Key Predicate
Relational Operator: Less Than (<)
Subquery Input Required: No
Filter Factor: 1

Predicate Text:
--------------
(0 < Q2.FID)

6) Start Key Predicate
Relational Operator: Equal (=)
Subquery Input Required: No
Filter Factor: 4.97315e-05

Predicate Text:
--------------
(Q2.FID = Q1.ID)

6) Stop Key Predicate
Relational Operator: Equal (=)
Subquery Input Required: No
Filter Factor: 4.97315e-05

Predicate Text:
--------------
(Q2.FID = Q1.ID)
Input Streams:
-------------
3) From Object SYSTEM.IDX031106181122000

Estimated number of rows: 1.7087e+08
Number of columns: 5
Subquery predicate ID: Not Applicable

Column Names:
------------
+Q2.FID(A)+Q2.AID(A)+Q2.Y(A)+Q2.X(A)+Q2.$RID$
Output Streams:
--------------
4) To Operator #2

Estimated number of rows: 339.948
Number of columns: 4
Subquery predicate ID: Not Applicable

Column Names:
------------
+Q2.FID(A)+Q2.AID(A)+Q2.Y(A)+Q2.X(A)
Objects Used in Access Plan:
---------------------------

Schema: SYSTEM
Name: IDX031106181122000
Type: Index
Time of creation: 2003-11-06-18.11.22.183114
Last statistics update:
Number of columns: 2
Number of rows: 20108
Width of rows: -1
Number of buffer pool pages: 1257
Distinct row values: No
Tablespace name: USERSPACE1
Tablespace overhead: 24.100000
Tablespace transfer rate: 0.900000
Source for statistics: Single Node
Prefetch page count: 2560
Container extent page count: 32
Index clustering statistic: 20.000000
Index leaf pages: 142
Index tree levels: 2
Index full key cardinality: 20108
Index first key cardinality: 20108
Index first 2 keys cardinality: -1
Index first 3 keys cardinality: -1
Index first 4 keys cardinality: -1
Index sequential pages: -1
Index page density: -1
Index avg sequential pages: -9
Index avg gap between sequences:-9
Index avg random pages: -9
Fetch avg sequential pages: -9
Fetch avg gap between sequences:-9
Fetch avg random pages: -9
Index RID count: 0
Index deleted RID count: 0
Index empty leaf pages: 0

Schema: SYSTEM
Name: IDX031106181122000
Type: Index
Time of creation: 2003-11-06-18.11.22.183139
Last statistics update:
Number of columns: 4
Number of rows: 170870376
Width of rows: -1
Number of buffer pool pages: 1275801
Distinct row values: Yes
Tablespace name: USERSPACE1
Tablespace overhead: 24.100000
Tablespace transfer rate: 0.900000
Source for statistics: Single Node
Prefetch page count: 2560
Container extent page count: 32
Index clustering statistic: 20.000000
Index leaf pages: 1794140
Index tree levels: 4
Index full key cardinality: 170870376
Index first key cardinality: 170870376
Index first 2 keys cardinality: -1
Index first 3 keys cardinality: -1
Index first 4 keys cardinality: -1
Index sequential pages: -1
Index page density: -1
Index avg sequential pages: -9
Index avg gap between sequences:-9
Index avg random pages: -9
Fetch avg sequential pages: -9
Fetch avg gap between sequences:-9
Fetch avg random pages: -9
Index RID count: 0
Index deleted RID count: 0
Index empty leaf pages: 0
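For readers not used to db2exfmt output: the plan above is a nested-loop join (NLJOIN) fed by two index scans (IXSCAN). Conceptually it behaves like the following Python sketch, using hypothetical toy data (illustrative only, not DB2 internals):

```python
import bisect

# Toy stand-ins for the two indexes in the plan, kept sorted by key
# like a B-tree leaf level: 2000 frames, 3 coordinate rows per frame.
frame_index = [(fid,) for fid in range(1, 2001)]
coord_index = [(fid, aid) for fid in range(1, 2001) for aid in range(3)]

def ixscan(index, lo, hi):
    """Range scan: entries with lo <= key < hi (start/stop key predicates)."""
    keys = [entry[0] for entry in index]
    return index[bisect.bisect_left(keys, lo):bisect.bisect_left(keys, hi)]

def nljoin():
    # Outer leg: IXSCAN on frame with 0 < id < 1000.
    for (fid,) in ixscan(frame_index, 1, 1000):
        # Inner leg: probe the coordinate index once per outer row (fid = id).
        yield from ixscan(coord_index, fid, fid + 1)

result = list(nljoin())
print(len(result))  # 999 qualifying frames x 3 coordinate rows = 2997
```

The key point the plan makes is that the inner index is probed once per qualifying outer row, which is why the cardinality of the outer leg matters so much.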

Nov 12 '05 #10
Blair Adamache wrote:
> Sorry. MySQL has less functionality than DB2, so it may execute a shorter
> code path for some operations, and so may be faster. My advice on DB2
> is still worth following. If DB2 is "over-optimizing" you could try the
> query with an optimization level somewhere from 1 to 4.


The URL below proves that the 'select' operation in MySQL is about 2 times faster than in DB2:

http://www.mysql.com/information/ben...-relative.html

So 7 minutes vs 3 minutes is a reasonable result :-).

Cheers,

Bing

Nov 12 '05 #11
Here is another piece of evidence:

http://www.bocabeacon.com/ycom/help/mysql.htm

Bing



Nov 12 '05 #12
I have three concerns about this URL, and why I feel it proves little:

1. it lists the DB2 version as DB2 5 (the report is from 1997). Since
then, DB2 has had 5 new releases (DB2 v5.2, 6.1, 7.1, 7.2, 8.1) and
published multiple TPC-H benchmarks on Linux, and two TPC-C benchmarks
on AIX. So I feel that DB2 performance has improved since 1997. A lot
has changed since 1997: Clinton is no longer president, and DB2 has come
out on Linux with 4 releases (v6.1 through v8.1).

2. although the URL claims to be a benchmark, it can't prove anything,
since a true benchmark needs a level playing field for comparisons. DB2
was not available on Linux in 1997, so they probably compared DB2 on an
old version of Windows, like NT.

3. the URL could be biased (just as I am) - after all, why list MySQL
first? True benchmarks are released and audited by impartial
organizations like TPC.

Bing Wu wrote:
> <previous message snipped>


Nov 12 '05 #13
Sorry, but this is biased as well. It's not evidence and it's not proof.
Why does the URL below give MySQL two chances to run the tests? One with
ODBC and one without? Is this fair to the other products listed? I'm
sure the other databases in the list would be faster if they could use
native methods and skip ODBC. This might be DB2's Call Level Interface
or static SQL.

Bing Wu wrote:
> <previous message snipped>


Nov 12 '05 #14
There is a recent eWEEK Labs/PC Labs database benchmark:

http://www.eweek.com/article2/0,4149,1184846,00.asp

MySQL and Oracle top the performance benchmark on large accesses:

http://www.eweek.com/slideshow/0,301...o=1&i=1,00.asp

Bing

Blair Adamache wrote:
> <previous message snipped>


Nov 12 '05 #15
Comments in-line.

Blair Adamache wrote:
> I have three concerns about this url and why I feel it proves little:
>
> 1. it lists the DB2 Version as DB2 5 (the report is from 1997). Since
> then, DB2 has had 5 new releases (DB2 v5.2, 6.1, 7.1, 7.2, 8.1) and
> published multiple TPC-H benchmarks on Linux, and two TPC-C benchmarks
> on AIX. So I feel that DB2 performance has improved since 1997. A lot
> has changed since 1997: Clinton is no longer president, and DB2 has come
> out on Linux with 4 releases (v6.1 through v8.1).

And, of course, it is equally true that MySQL and Linux have changed
too. So I agree ... the comparison is meaningless in 2003.

> 2. although the URL claims to be a benchmark, it can't prove anything,
> since a true benchmark needs a level playing field for comparisons.
> DB2 was not available on Linux in 1997, so they probably compared DB2
> on an old version of Windows, like NT.

Pure speculation on all sides. A benchmark without context is meaningless.

> 3. the URL could be biased (just as I am) - after all, why list MySQL
> first? True benchmarks are released and audited by impartial
> organizations like TPC.

I'd hardly call TPC impartial ... I'd call it useless. Performance is only
a small part of the rationale for purchasing an RDBMS. More reasonable
criteria would include:

1. Security
2. Stability
3. Scalability
4. Supportability
5. Third-party products that can be hosted such as financial systems
6. Price

From my experience on comparable hardware I think MySQL is indeed
faster. But that doesn't make it better for any specific purpose. I can't
recall the last time a database failed due to speed and speed alone. Most
of the time speed issues can be resolved by just getting someone on the
project who knows how to write and tune decent code.

--
Daniel Morgan
http://www.outreach.washington.edu/e...ad/oad_crs.asp
http://www.outreach.washington.edu/e...oa/aoa_crs.asp
da******@x.washington.edu
(replace 'x' with a 'u' to reply)

Nov 12 '05 #16
The first link explains why the second link has to be viewed with
caution. And it shows the results of a rerun.
There was something odd about the original eweek setup.
Who knows. I recall an incident where a benchmark fell over because of
what turned out to be a dead battery in the disk cache.

From your earlier links, e.g., I'm surprised by the Oracle numbers.
I don't buy a benchmark that makes our major competitor orders of
magnitude worse than DB2, as much as I wish it were true ;-)

Cheers
Serge
--
Serge Rielau
DB2 SQL Compiler Development
IBM Toronto Lab

Nov 12 '05 #17
Bing Wu <bi**@biop.ox.ac.uk> wrote in message news:<bo**********@news.ox.ac.uk>...
> While I run the query against the 'coordinate' table:
>
> select x,y,x from coordinate where fid<1000
>
> It takes more than 10 minutes to finish. Don't really understand why?

Sounds as though this is just illustrating my point that with DB2 (and
I think most DBMS's) exporting a large result set is often a bottleneck
on (I think) the coordinator agent. mySQL by all accounts is unusually
lightweight, and I suspect it has a very different architecture that may
be the ideal choice if your sole criterion is speed of dumping tables
to flat files. I also suspect your query may even be serviced by a single
agent (thread) - maybe the reduction in inter-agent communication
overhead is a big advantage for queries that just dump tables???

Nobody's come back yet with suggestions on how to increase the fetch rate
over 20,000/sec, so I guess you can take that as tacit acceptance that
this is more-or-less optimal for DB2 right now. I'd love a technical
explanation of why mySQL might be so much quicker under these
(specific) circumstances - is this basically inevitable?
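The "multiple streams" suggestion made earlier in the thread boils down to partitioning the fid range and running one cursor per partition concurrently. A toy Python sketch of the range split (a hypothetical helper, not a DB2 client):

```python
def fid_ranges(lo, hi, streams):
    """Split [lo, hi) into roughly equal sub-ranges, one per stream."""
    step = (hi - lo + streams - 1) // streams
    return [(s, min(s + step, hi)) for s in range(lo, hi, step)]

# Each (lo, hi) pair could then drive its own
# "select ... where fid >= lo and fid < hi" over a separate connection.
print(fid_ranges(0, 1000, 4))  # [(0, 250), (250, 500), (500, 750), (750, 1000)]
```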
DG
Nov 12 '05 #18
Ken
Daniel Morgan <da******@x.washington.edu> wrote in message news:<1068745990.118235@yasure>...
> I'd hardly call TPC impartial ... I'd call it useless. Performance is only
> a small part of the rationale for purchasing an RDBMS. More reasonable
> criteria would include:
>
> 1. Security
> 2. Stability
> 3. Scalability
> 4. Supportability
> 5. Third-party products that can be hosted such as financial systems
> 6. Price

Great response - a good solution requires a focus on more than just a
single element. In the past I've usually broken it down a little
differently:
- adaptability
- manageability
- scalability and performance
- reliability, availability, and serviceability
- affordability
Still, pretty similar.

> From my experience on comparable hardware I think MySQL is indeed
> faster.

Hmmm, but I've never seen a mysql benchmark that wasn't questionable
in validity (such as their comparisons to postgresql), or trivial in
implementation.

They're typically read-only activities on single-cpu boxes using small
databases. While that's a valid scenario - it hardly covers all the
bases - such as SMP or MPP servers, mixed workloads, concurrency,
backups, etc. And they're typically implemented by folks that don't
really know databases. E-Week? Please. And they compared mysql 4.0
to db2 7.2? Did they use InnoDB or ISAM for mysql? If not InnoDB,
then mysql didn't have RI and it was an apples to oranges comparison.

Ken
Nov 12 '05 #19
Bing Wu wrote:
Hi all,

I am running a database containing large datasets:

frames: 20 thousand rows,
coordinates: 170 million row.

The database has been implemented with:

IBM DB2 v8.1
MySQL v3.23.54

using similar DDL and same indexes. And I have run a SQL joining two
tables:

select c.x,c.y,c.z from coordinate as c,frame as f where fid<1000
and f.tid=1 and f.id=c.fid

The query operates on 3400 billion possible join combinations and returns
about 8.5 million records. It seems MySQL outperforms DB2 by a wide margin:

DB2: 7 minutes
MySQL: 3 minutes


Would you mind posting DDL for both tables? Inquiring minds want to know
and play with configuration a bit.

Jan M. Nelken

Nov 12 '05 #20
Ken wrote:
Daniel Morgan <da******@x.washington.edu> wrote in message news:<1068745990.118235@yasure>...
I'd hardly call TPC impartial ... I'd call it useless. Performance is only
one small part of the rationale for purchasing an RDBMS. A more reasonable
set of criteria would include:

1. Security
2. Stability
3. Scalability
4. Supportability
5. Third-party products that can be hosted such as financial systems
6. Price


great response - a good solution requires a focus on more than just a
single element. In the past I've usually broken it a little
differently:
- adaptability
- manageability
- scalability and performance
- reliability, availability, and serviceability
- affordability
Still, pretty similar.
From my experience on comparable hardware I think MySQL is indeed
faster.


Hmmm, but I've never seen a mysql benchmark that wasn't questionable
in validity (such as their comparisons to postgresql), or trivial in
implementation.

They're typically read-only activities on single-cpu boxes using small
databases. While that's a valid scenario - it hardly covers all the
bases - such as SMP or MPP servers, mixed workloads, concurrency,
backups, etc. And they're typically implemented by folks that don't
really know databases. E-Week? Please. And they compared mysql 4.0
to db2 7.2? Did they use innodb or isam for mysql? If not innodb
then mysql didn't have RI and it was an apples to oranges comparison.

Ken

Part of my point, perhaps, wasn't too clear.

One can easily achieve great speed if one is willing to sacrifice some
of the other items on the list (either your list or mine).

The basic methodology for pulling information out of a SQL database has
been in the literature for years. I would think most anyone could store
and retrieve information with great speed.

The question is whether you can do it in an environment with thousands of
simultaneous transactions while maintaining security, stability, etc. It is
those other criteria, beyond performance, that are why DB2, Oracle, and
Informix are where they are while MySQL is still being given away for free.
And it is precisely why I think performance benchmarks aren't worth the
paper on which they are printed. Just fluff that should be used as kindling.

--
Daniel Morgan
http://www.outreach.washington.edu/e...ad/oad_crs.asp
http://www.outreach.washington.edu/e...oa/aoa_crs.asp
da******@x.washington.edu
(replace 'x' with a 'u' to reply)
Nov 12 '05 #21
Jan M. Nelken wrote:

Would you mind posting DDL for both tables? Inquiring minds want to know
and play with configuration a bit.

Jan M. Nelken


Full DDL for both tables is attached below for your reference. I'd appreciate any comments.

Thanks,

Bing

DB2:-

CREATE TABLE "DB2ADMIN"."COORDINATE" (
"FID" INTEGER NOT NULL ,
"AID" INTEGER NOT NULL ,
"X" REAL NOT NULL ,
"Y" REAL NOT NULL ,
"Z" REAL NOT NULL )
IN "USERSPACE1" ;
ALTER TABLE "DB2ADMIN"."COORDINATE" APPEND ON;

CREATE INDEX "DB2ADMIN"."IDX_COORDINATE1" ON "DB2ADMIN"."COORDINATE"
("FID" ASC)
PCTFREE 0 ;

CREATE INDEX "DB2ADMIN"."IDX_COORDINATE2" ON "DB2ADMIN"."COORDINATE"
("AID" ASC)
PCTFREE 0 ;

ALTER TABLE "DB2ADMIN"."COORDINATE"
ADD CONSTRAINT "PK_COORDINATE8" PRIMARY KEY
("FID",
"AID");

CREATE BUFFERPOOL "USERPOOL" SIZE 10820 PAGESIZE 32768 NOT EXTENDED STORAGE;

ALTER TABLESPACE SYSCATSPACE
PREFETCHSIZE 32
OVERHEAD 24.100000
TRANSFERRATE 0.900000;
ALTER TABLESPACE TEMPSPACE1
PREFETCHSIZE 32
OVERHEAD 24.100000
TRANSFERRATE 0.900000;
ALTER TABLESPACE USERSPACE1
PREFETCHSIZE 32
OVERHEAD 24.100000
TRANSFERRATE 0.900000;

CREATE TABLE "DB2ADMIN"."FRAME" (
"ID" INTEGER NOT NULL ,
"TID" INTEGER NOT NULL ,
"SNUM" INTEGER NOT NULL WITH DEFAULT 1 ,
"TIME" REAL NOT NULL WITH DEFAULT 0 ,
"TYPE" CHAR(4) NOT NULL ,
"VOLUME" DOUBLE NOT NULL ,
"PRESSURE" DOUBLE NOT NULL ,
"TEMP" DOUBLE NOT NULL ,
"TEMP_S" DOUBLE NOT NULL ,
"TEMP_1" DOUBLE ,
"TEMP_2" DOUBLE ,
"TEMP_3" DOUBLE ,
"TEMP_4" DOUBLE ,
"TEMP_5" DOUBLE ,
"ENERGY" DOUBLE NOT NULL ,
"ENERGY_S" DOUBLE NOT NULL ,
"ENERGY_1" DOUBLE ,
"ENERGY_2" DOUBLE ,
"ENERGY_3" DOUBLE ,
"ENERGY_4" DOUBLE ,
"ENERGY_5" DOUBLE ,
"POTENTIAL" DOUBLE NOT NULL ,
"POTENTIAL_S" DOUBLE NOT NULL ,
"POTENTIAL_1" DOUBLE ,
"POTENTIAL_2" DOUBLE ,
"POTENTIAL_3" DOUBLE ,
"POTENTIAL_4" DOUBLE ,
"POTENTIAL_5" DOUBLE ,
"RMSD" DOUBLE WITH DEFAULT 0 ,
"RADIUS_OF_GYRATION" DOUBLE WITH DEFAULT 0 )
IN "USERSPACE1" ;
CREATE INDEX "DB2ADMIN"."IDX_FRAME1" ON "DB2ADMIN"."FRAME"
("TID" ASC)
PCTFREE 10 ;

CREATE INDEX "DB2ADMIN"."IDX_FRAME2" ON "DB2ADMIN"."FRAME"
("TYPE" ASC)
PCTFREE 10 ;

CREATE INDEX "DB2ADMIN"."IDX_FRAME3" ON "DB2ADMIN"."FRAME"
("SNUM" ASC)
PCTFREE 10 ;

CREATE INDEX "DB2ADMIN"."IDX_FRAME4" ON "DB2ADMIN"."FRAME"
("TIME" ASC)
PCTFREE 10 ;

ALTER TABLE "DB2ADMIN"."FRAME"
ADD CONSTRAINT "PK_FRAME3" PRIMARY KEY
("ID");
MySQL:-

CREATE TABLE FRAME (
ID INTEGER NOT NULL ,
TID INTEGER NOT NULL ,
SNUM INTEGER NOT NULL DEFAULT 1 ,
TIME REAL NOT NULL DEFAULT 0,
TYPE CHAR(4) NOT NULL ,
VOLUME DOUBLE NOT NULL ,
PRESSURE DOUBLE NOT NULL ,
TEMP DOUBLE NOT NULL ,
TEMP_S DOUBLE NOT NULL ,
TEMP_1 DOUBLE ,
TEMP_2 DOUBLE ,
TEMP_3 DOUBLE ,
TEMP_4 DOUBLE ,
TEMP_5 DOUBLE ,
ENERGY DOUBLE NOT NULL ,
ENERGY_S DOUBLE NOT NULL ,
ENERGY_1 DOUBLE ,
ENERGY_2 DOUBLE ,
ENERGY_3 DOUBLE ,
ENERGY_4 DOUBLE ,
ENERGY_5 DOUBLE ,
POTENTIAL DOUBLE NOT NULL ,
POTENTIAL_S DOUBLE NOT NULL ,
POTENTIAL_1 DOUBLE ,
POTENTIAL_2 DOUBLE ,
POTENTIAL_3 DOUBLE ,
POTENTIAL_4 DOUBLE ,
POTENTIAL_5 DOUBLE ,
RMSD DOUBLE DEFAULT 0 ,
RADIUS_OF_GYRATION DOUBLE DEFAULT 0,
PRIMARY KEY ID (ID),
INDEX (TID)
);
CREATE TABLE COORDINATE (
FID INTEGER NOT NULL ,
AID INTEGER NOT NULL ,
X REAL NOT NULL ,
Y REAL NOT NULL ,
Z REAL NOT NULL ,
PRIMARY KEY (FID,AID),
INDEX (FID),
INDEX (AID)
);

Nov 12 '05 #22
Full DDL for both tables is attached below for your reference. I'd appreciate any comments.
Thanks,

Bing

DB2:-

CREATE TABLE "DB2ADMIN"."COORDINATE" (
"FID" INTEGER NOT NULL ,
"AID" INTEGER NOT NULL ,
"X" REAL NOT NULL ,
"Y" REAL NOT NULL ,
"Z" REAL NOT NULL )
IN "USERSPACE1" ;
ALTER TABLE "DB2ADMIN"."COORDINATE" APPEND ON;

CREATE INDEX "DB2ADMIN"."IDX_COORDINATE1" ON "DB2ADMIN"."COORDINATE"
("FID" ASC)
PCTFREE 0 ;

CREATE INDEX "DB2ADMIN"."IDX_COORDINATE2" ON "DB2ADMIN"."COORDINATE"
("AID" ASC)
PCTFREE 0 ;

ALTER TABLE "DB2ADMIN"."COORDINATE"
ADD CONSTRAINT "PK_COORDINATE8" PRIMARY KEY
("FID",
"AID");

You should create the index DB2ADMIN.IDX_COORDINATE1 to be UNIQUE. Later
when you define the primary key, DB2 cannot use the existing index that you
created. Same with all other tables that have primary keys.

CREATE UNIQUE INDEX "DB2ADMIN"."IDX_COORDINATE1" ON "DB2ADMIN"."COORDINATE"
("FID" ASC)
PCTFREE 0 ;

With a large table, it would be better to have the index and table in
different tablespaces. If you are adding new rows to the table, then you
need to reconsider the PCTFREE 0 on the indexes. With the append clause, the
rows are inserted at the end of the table, but indexes are always kept in
order, so some percent free is probably advisable (depending on how often
you reorg). You are probably getting a lot of split index pages, which can
slow down queries (unless you reorg them after each data load).
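As a rough sketch of the reorg option (the PCTFREE value is only an
example; index and table names are from the DDL posted above):

```sql
-- Rebuild the AID index with some free space for future inserts
-- (PCTFREE 10 is only an illustration; tune it to your load pattern)
DROP INDEX "DB2ADMIN"."IDX_COORDINATE2";
CREATE INDEX "DB2ADMIN"."IDX_COORDINATE2" ON "DB2ADMIN"."COORDINATE"
    ("AID" ASC)
    PCTFREE 10;

-- Or simply reorganize the existing indexes after each bulk load
REORG INDEXES ALL FOR TABLE "DB2ADMIN"."COORDINATE";
```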
Nov 12 '05 #23
Mark A wrote:

[COORDINATE DDL quoted in full - snipped; see post #22]


You should create the index DB2ADMIN.IDX_COORDINATE1 to be UNIQUE. Later
when you define the primary key, DB2 cannot use the existing index that you
created. Same with all other tables that have primary keys.

CREATE UNIQUE INDEX "DB2ADMIN"."IDX_COORDINATE1" ON "DB2ADMIN"."COORDINATE"
("FID" ASC)
PCTFREE 0 ;


Thanks for pointing out the problem. Just one doubt (maybe I'm wrong): shouldn't a UNIQUE index only be applied to columns with unique values? In the above table, 'fid' is certainly not unique.

Bing
Nov 12 '05 #24
> Thanks for pointing out the problem. Just a doubt. Maybe I was wrong.
Should the UNIQUE INDEX only be applied to columns with unique value? In the
above table, 'fid' certainly is not unique.

Bing

My apologies. Your primary key has 2 columns. In that case, you don't need
the index DB2ADMIN.IDX_COORDINATE1, since DB2 will create an index on the
primary key. Since the first column of the index created for the primary key
is the one you want, the other index is redundant, so you can just drop it.
Nov 12 '05 #25
Ken wrote:
Daniel Morgan <da******@x.washington.edu> wrote in message news:<1068745990.118235@yasure>...

Hmmm, but I've never seen a mysql benchmark that wasn't questionable in
validity (such as their comparisons to postgresql), or trivial in
implementation.

They're typically read-only activities on single-cpu boxes using small
databases. While that's a valid scenario - it hardly covers all the
bases - such as SMP or MPP servers, mixed workloads, concurrency,
backups, etc. And they're typically implemented by folks that don't
really know databases. E-Week? Please. And they compared mysql 4.0
to db2 7.2? Did they use innodb or isam for mysql? If not innodb
then mysql didn't have RI and it was an apples to oranges comparison.


MySQL has been very popular and active in the e-commerce world. I have implemented a few projects using MySQL in the past 4 years with no major complaints. It's fast and cheap to run, with native multi-thread support, so it's well suited to SMP/MPP servers. It also has impressive concurrency support, so handling thousands of simultaneous connections isn't a problem. However, one of its disadvantages is support: since it's GPL-based, it is not as well supported (funded) as purely commercial databases such as DB2 or Oracle. But it's growing very fast, especially in SMB sectors. And it's free, so large ISPs are using MySQL to handle large transaction volumes.

Bing
Nov 12 '05 #26
Mark A wrote:
My apologies. Your primary key has 2 columns. In that case, you don't need the
index DB2ADMIN.IDX_COORDINATE1, since DB2 will create an index on the primary
key. Since the first column of the index created for the primary key is the
one you want, the other index is redundant, so you can just drop it.

More questions following. I have primary key (FID, AID) on the table, but I have queries on FID only as well as on AID only. Since DB2 auto-indexes the primary key, do I still need to create individual indexes on FID and AID to get better search results? Also, I can specify reverse scan on an index I create myself, but is this auto-enabled on a PK index?

Many thanks,

Bing

Nov 12 '05 #27
"Bing Wu" <bi**@biop.ox.ac.uk> wrote in message
news:bp**********@news.ox.ac.uk...
Mark A wrote:
My apologies. Your primary key has 2 columns. In that case, you don't need the index DB2ADMIN.IDX_COORDINATE1, since DB2 will create an index on the primary key. Since the first column of the index created for the primary key is the one you want, the other index is redundant, so you can just drop it.

More questions following. I have primary key (FID, AID) on the table, but I
have queries on FID only as well as on AID only. Since DB2 auto-indexes the
primary key, do I still need to create individual indexes on FID and AID to
get better search results? Also, I can specify reverse scan on an index I
create myself, but is this auto-enabled on a PK index?
Many thanks,

Bing

Since DB2 creates a unique index on (FID, AID) if one does not already
exist, then you don't need another index on FID. But you do need an index on
AID since it is not the first column of the index for the primary key.
Another index on FID-only is redundant to the one created for the Primary
Key. Removing the redundant index will speed up data loads, reorgs, etc.

I don't know how that works in MySQL, but for DB2 what I said applies.
Nov 12 '05 #28
> > More questions following. I have primary key (FID, AID) on the table, but
I have queries on FID only as well as on AID only. Since DB2 auto-indexes the
primary key, do I still need to create individual indexes on FID and AID to
get better search results? Also, I can specify reverse scan on an index I
create myself, but is this auto-enabled on a PK index?

Many thanks,

Bing

Reverse scan is not auto-enabled for the index created by the primary key.
You can create the primary key unique index first manually with allowing
reverse scans (before defining the primary key) and then DB2 will use the
index you created instead of creating one on its own.
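Roughly, the sequence would look like this (a sketch only; the index name
IDX_COORDINATE_PK is made up for the example):

```sql
-- Create the unique index first, with reverse scans enabled
CREATE UNIQUE INDEX "DB2ADMIN"."IDX_COORDINATE_PK"
    ON "DB2ADMIN"."COORDINATE" ("FID" ASC, "AID" ASC)
    ALLOW REVERSE SCANS;

-- DB2 now adopts the existing unique index for the primary key
-- instead of creating one of its own
ALTER TABLE "DB2ADMIN"."COORDINATE"
    ADD CONSTRAINT "PK_COORDINATE8" PRIMARY KEY ("FID", "AID");
```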
Nov 12 '05 #29
Mark A wrote:
Since DB2 creates a unique index on (FID, AID) if one does not already
exist, then you don't need another index on FID. But you do need an index on
AID since it is not the first column of the index for the primary key.
Another index on FID-only is redundant to the one created for the Primary
Key. Removing the redundant index will speed up data loads, reorgs, etc.

Thanks for the comments. I do have other queries based on AID, so I assume I should create a separate AID index, since a reverse scan on (FID, AID) won't make sense for a query that only searches on AID or FID alone.

Bing
Nov 12 '05 #30
> Mark A wrote:
> Since DB2 creates a unique index on (FID, AID) if one does not already
> exist, then you don't need another index on FID. But you do need an index
> on AID since it is not the first column of the index for the primary key.
> Another index on FID-only is redundant to the one created for the primary
> key. Removing the redundant index will speed up data loads, reorgs, etc.

Thanks for the comments. I do have other queries based on AID, so I assume
I should create a separate AID index, since a reverse scan on (FID, AID)
won't make sense for a query that only searches on AID or FID alone.

Bing

To effectively use the b-tree of the index (as opposed to scanning the
entire index) you must supply the first column(s) of the index in your
predicate. It is OK if you do not supply the trailing columns of the index.

So you need one unique index on (FID, AID) and one index on AID. Having
another index on FID-only is redundant, and not good for "overall"
performance.

To use the allow reverse scan feature of the index on (FID, AID), you will
need to create it as a unique index allowing reverse scans before you create
the primary key. This will allow DB2 to use this index and not create one of
its own as part of the primary key statement.
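To illustrate the leading-column rule with the tables from this thread (a
sketch of expected index usage, not measured access plans):

```sql
-- FID is the leading column of the (FID, AID) primary-key index,
-- so this predicate can use its b-tree
SELECT x, y, z FROM coordinate WHERE fid < 1000;

-- AID is a trailing column of that index, so this predicate needs
-- the separate single-column index on AID
SELECT x, y, z FROM coordinate WHERE aid = 42;
```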
Nov 12 '05 #31
Mark A wrote:
To effectively use the b-tree of the index (as opposed to scanning the
entire index) you must supply the first column(s) of the index in your
predicate. It is OK if you do not supply the trailing columns of the index.

So you need one unique index on (FID, AID) and one index on AID. Having
another index on FID-only is redundant, and not good for "overall"
performance.

To use the allow reverse scan feature of the index on (FID, AID), you will
need to create it as a unique index allowing reverse scans before you create
the primary key. This will allow DB2 to use this index and not create one of
its own as part of the primary key statement.

Thanks Mark. I'll apply the changes and see if they make any difference in performance :-).

Bing
Nov 12 '05 #32
> Mark A wrote:
To effectively use the b-tree of the index (as opposed to scanning the
entire index) you must supply the first column(s) of the index in your
predicate. It is OK if you do not supply the trailing columns of the index.
So you need one unique index on (FID, AID) and one index on AID. Having
another index on FID-only is redundant, and not good for "overall"
performance.

To use the allow reverse scan feature of the index on (FID, AID), you will
need to create it as a unique index allowing reverse scans before you create
the primary key. This will allow DB2 to use this index and not create one of
its own as part of the primary key statement.

Thanks Mark. I'll apply the changes and see if they make any difference in
performance :-).

Bing

It would probably help if you define a separate tablespace for the indexes
(defined in the create table statement).
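A sketch of what that could look like; the tablespace names DATA_TS and
INDEX_TS are invented for the example and would have to be created first
(as DMS tablespaces, since INDEX IN requires DMS):

```sql
CREATE TABLE "DB2ADMIN"."COORDINATE" (
    "FID" INTEGER NOT NULL ,
    "AID" INTEGER NOT NULL ,
    "X" REAL NOT NULL ,
    "Y" REAL NOT NULL ,
    "Z" REAL NOT NULL )
    IN "DATA_TS" INDEX IN "INDEX_TS" ;
```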
Nov 12 '05 #33
Leo
Since you are getting index-only access on both tables, how many disks do
you have? How many containers per tablespace? Are the tempspace and the
two table/index tablespaces (or table spaces, if you're using SMS) on
separate physical disks? And how many NUM_IOSERVERS and NUM_IOCLEANERS
do you have defined?
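For reference, those settings can be inspected and changed from the DB2
command line; MYDB and the values shown are placeholders only:

```sql
-- Show the current prefetcher / page-cleaner settings
db2 GET DB CFG FOR MYDB

-- Example values only; a common rule of thumb is roughly one
-- prefetcher per physical disk
db2 UPDATE DB CFG FOR MYDB USING NUM_IOSERVERS 4 NUM_IOCLEANERS 4
```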

Leo

db******@hotmail.com (Database Guy) wrote in message
news:<7f**************************@posting.google. com>...
Bing Wu <bi**@biop.ox.ac.uk> wrote in message news:<bo**********@news.ox.ac.uk>...
While I run the query against the 'coordinate' table:

select x,y,z from coordinate where fid<1000

It takes more than 10 minutes to finish. Don't really understand why?


Sounds as though this is just illustrating my point that with DB2 (and
I think most DBMSs) exporting a large result set is often a bottleneck
on (I think) the coordinator agent. mySQL by all accounts is unusually
lightweight, and I suspect it has a very different architecture that may
be the ideal choice if your sole criterion is speed of dumping tables
to flat files. I also suspect your query may even be serviced by a single
agent (thread) - maybe the reduction in inter-agent communication
overhead is a big advantage for queries that just dump tables?

Nobody's come back yet with suggestions on how to increase fetch rate
over 20,000/sec, so I guess you can take that as tacit acceptance that
this is more-or-less optimal for DB2 right now. I'd love a technical
explanation of why mySQL might be so much quicker under these
(specific) circumstances - is this basically inevitable?
DG

Nov 12 '05 #34
Leo
It looks to me like the tables are created in the USERSPACE1
tablespace, which by default is 4K page size. Then you have this:
"CREATE BUFFERPOOL "USERPOOL" SIZE 10820 PAGESIZE 32768 NOT EXTENDED
STORAGE;"

So you create a bufferpool with a 32K page size. You cannot put a
tablespace defined with a 4K page size in a 32K-page bufferpool. Also, I
don't see where you alter any tablespaces to use the bufferpool you
created. My guess is that your tempspace and userspace1 are still using
the IBMDEFAULTBP bufferpool, which is most likely the default size of
1000 4K pages.

Drop the USERPOOL bufferpool, alter the IBMDEFAULTBP to be about
100000 pages, stop and start the database and try it again. I'm still
wondering about the number of IOCLEANERS and IOSERVERS, but this looks
like a bufferpool sizing issue to me. Or am I missing some detail
about your configuration?
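As statements, that suggestion would be roughly (100000 4K pages is about
390 MB; adjust to fit the 1 GB of system memory mentioned earlier):

```sql
DROP BUFFERPOOL "USERPOOL";
ALTER BUFFERPOOL "IBMDEFAULTBP" SIZE 100000;
-- then recycle the instance (db2stop / db2start) so the change takes effect
```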

Leo

Bing Wu <bi**@biop.ox.ac.uk> wrote in message news:<bp**********@news.ox.ac.uk>...
Jan M. Nelken wrote:

Would you mind posting DDL for both tables? Inquiring minds want to know
and play with configuration a bit.

Jan M. Nelken


[full DDL for both tables quoted in full - snipped; identical to the listing in post #22]

Nov 12 '05 #35
Ken
Bing Wu <bi**@biop.ox.ac.uk> wrote in message news:<bp**********@news.ox.ac.uk>...
MySQL has been very popular and active in the e-commerce world. I have
implemented a few projects using MySQL in the past 4 years with no major
complaints. It's fast and cheap to run, with native multi-thread support,
so it's well suited to SMP/MPP servers. It also has impressive concurrency
support, so handling thousands of simultaneous connections isn't a problem.
However, one of its disadvantages is support: since it's GPL-based, it is
not as well supported (funded) as purely commercial databases such as DB2
or Oracle. But it's growing very fast, especially in SMB sectors. And it's
free, so large ISPs are using MySQL to handle large transaction volumes.

Unfortunately the vendor, MySQL AB, has also repeatedly stated that
capabilities like views, sub-selects, unions, stored procedures, and
referential integrity are only needed by 1% of applications and a
waste of time for everyone else, that RI should be implemented by the
applications, that the only thing that counts is speed, etc. No
thanks - I remember what that was like back in the early 80s before
robust relational database management systems became widely available.
And as far as large ISPs using it to handle transactions are
concerned - I have only seen large ISPs offering it because of name
recognition. MySQL doesn't really offer transactions - it relies upon
a completely separate product (InnoDB) to provide them.

So, for that reason my recommendation to my colleagues is to use
postgresql when they want something free. Postgresql has sufficient
capabilities that if they want to port the code to a genuine
relational database that the migration is fairly painless: they don't
end up pushing a ton of complexity into the application, but can allow
the database to handle it instead.

This view was recently confirmed when I migrated a large SQL Server
solution to postgresql. Although I hit a few snags, in general there
was a 1:1 mapping between data model components in SQL Server and
postgresql. That would not at all have been the case with mysql.

ken
Nov 12 '05 #36
"Mark A" <ma@switchboard.net> wrote in message news:<3m***************@news.uswest.net>...
It would probably help if you define a separate tablespace for the indexes
(defined in the create table statement).


Why? Will any of these suggestions help increase the fetch rate beyond
20,000/sec?

Personally I've never seen much higher than that, so if there are
stacks of people out there who've used DB2 and got much higher rates I
really want to know how.
DG
Nov 12 '05 #37
Database Guy wrote:
"Mark A" <ma@switchboard.net> wrote in message news:<3m***************@news.uswest.net>...

It would probably help if you define a separate tablespace for the indexes
(defined in the create table statement).

Why? Will any of these suggestions help increase the fetch rate beyond
20,000/sec?

Personally I've never seen much higher than that, so if there are
stacks of people out there who've used DB2 and got much higher rates I
really want to know how.
DG


Plus separating tables and indexes has never done anything for anybody
in any database to improve performance. At least not in a multi-user
multi-transaction environment with various flavours of RAID.
--
Daniel Morgan
http://www.outreach.washington.edu/e...ad/oad_crs.asp
http://www.outreach.washington.edu/e...oa/aoa_crs.asp
da******@x.washington.edu
(replace 'x' with a 'u' to reply)

Nov 12 '05 #38
> > "Mark A" <ma@switchboard.net> wrote in message
It would probably help if you define a separate tablespace for the indexes (defined in the create table statement).


Plus separating tables and indexes has never done anything for anybody
in any database to improve performance. At least not in a multi-user
multi-transaction environment with various flavours of RAID.
--
Daniel Morgan


If you followed this discussion from the beginning, it was stated that the
person was scanning and retrieving a very large number of rows in a table
with over 4 million rows. Sequential prefetch for the table and/or indexes
would probably be used in an optimal access path. In such cases, table and
index placement is important, as is extent size, prefetch size, and
bufferpool page size. The optimum configuration is somewhat different for
most flavors of RAID, and needs to be done carefully if performance is
paramount (the guy is complaining about performance).

I would agree that in a multi-user transaction system (definitely not the
system described here), dataset placement is not critical, especially with
decent sized buffer pools.
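For example, one common guideline is to set the prefetch size to the
extent size times the number of containers, so a single prefetch request
can drive all stripes in parallel (the value 64 below is only an
illustration, not a recommendation for this system):

```sql
-- Sketch: PREFETCHSIZE = EXTENTSIZE x number of containers
ALTER TABLESPACE USERSPACE1 PREFETCHSIZE 64;
```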
Nov 12 '05 #39
Mark A wrote:
"Mark A" <ma@switchboard.net> wrote in message
It would probably help if you define a separate tablespace for the
indexes
(defined in the create table statement).


Plus separating tables and indexes has never done anything for anybody
in any database to improve performance. At least not in a multi-user
multi-transaction environment with various flavours of RAID.
--
Daniel Morgan

If you followed this discussion from the beginning, it was stated that the
person was scanning and retrieving a very large number of rows in a table
with over 4 million rows. Sequential prefetch for the table and/or indexes
would probably be used in an optimal access path. In such cases, table and
index placement is important, as is extent size, prefetch size, and
bufferpool page size. The optimum configuration is somewhat different for
most flavors of RAID, and needs to be done carefully if performance is
paramount (the guy is complaining about performance).

I would agree that in a multi-user transaction system (definitely not the
system described here), dataset placement is not critical, especially with
decent sized buffer pools.


I would suggest that it isn't important in the described system either.
To assume that sequential means that a head on a disk is doing one and
only one thing until it is done is nice theory, but not real-world. With
many operating systems, and I don't know what operating system is being
used here, one can't even guarantee that a single file laid down on a
disk will be contiguous except within a raw partition.
--
Daniel Morgan
http://www.outreach.washington.edu/e...ad/oad_crs.asp
http://www.outreach.washington.edu/e...oa/aoa_crs.asp
da******@x.washington.edu
(replace 'x' with a 'u' to reply)

Nov 12 '05 #40
> > If you followed this discussion from the beginning, it was stated that
the person was scanning and retrieving a very large number of rows in a
table with over 4 million rows. Sequential prefetch for the table and/or
indexes would probably be used in an optimal access path. In such cases,
table and index placement is important, as is extent size, prefetch size,
and bufferpool page size. The optimum configuration is somewhat different
for most flavors of RAID, and needs to be done carefully if performance is
paramount (the guy is complaining about performance).

I would agree that in a multi-user transaction system (definitely not the
system described here), dataset placement is not critical, especially with
decent sized buffer pools.


I would suggest that it isn't important in the described system either.
To assume that sequential means that a head on a disk is doing one and
only one thing until it is done isn't real world. It is nice theory
but not real-world. With many operating systems, and I don't know what
operating system is being used here, one can't even guarantee that a
single file laid down on a disk will be contiguous except within a raw
partition.
--
Daniel Morgan


Maybe you need to do some research on DB2 sequential prefetch for tables and
indexes.
Nov 12 '05 #41
Mark A wrote:
Maybe you need to do some research on DB2 sequential prefetch for tables and
indexes.


Oracle and Informix have lots of things that work in a specific
way in theory too. But when we take them into the lab and look at
what is actually happening we find something very different.

Now this may not be true of DB2 on OS/390 or some other
hardware/operating-system combination where the DBMS has complete control.
But it is absolutely true of DB2, or any other RDBMS, on, for example,
the Windows operating system on Intel hardware.

Let me quote from: https://aurora.vcu.edu/db2help/db2d0/frame3.htm#preftch

"Reading several consecutive pages into the buffer pool using a single
I/O operation can greatly reduce the overhead associated with running
your application. In addition, performing multiple I/O operations in
parallel to read in several ranges of pages at the same time can help
reduce the time your application needs to wait for I/O operations to
complete."

All very nice. But what does this have to do with separating tables
and indexes? And the answer on a real-world system is: nothing. That it
is meaningless is something we have proven in the labs over and over
again. I'm not saying prefetch isn't valuable - prefetch was not part
of the conversation. I said separating tables and indexes serves no
useful purpose.

The only value is in spreading I/O over as many spindles and heads as
possible and as evenly as possible. And once you have accomplished that
reading a page of index or a page of table is just that ... reading a
page. No RDBMS and no operating systems cares which is which ... the
overhead is the same. And if you stripe multiple physical disks into a
single logical disk ... the page can be anywhere.
--
Daniel Morgan
http://www.outreach.washington.edu/e...ad/oad_crs.asp
http://www.outreach.washington.edu/e...oa/aoa_crs.asp
da******@x.washington.edu
(replace 'x' with a 'u' to reply)

Nov 12 '05 #42
Oracle and Informix have lots of things that work in a specific
way in theory too. But when we take them into the lab and look at
what is actually happening we find something very different.

Now this may not be true of DB2 on OS/390 or some other
hardware-operating system combination where the RDBMS has complete control.
But it is absolutely true with DB2, or any other RDBMS on, for example,
the Windows operating system on Intel hardware.

Let me quote from: https://aurora.vcu.edu/db2help/db2d0/frame3.htm#preftch

"Reading several consecutive pages into the buffer pool using a single
I/O operation can greatly reduce the overhead associated with running
your application. In addition, performing multiple I/O operations in
parallel to read in several ranges of pages at the same time can help
reduce the time your application needs to wait for I/O operations to
complete."

All very nice. But what does this have to do with separating tables
and indexes? And the answer on a real-world system is nothing. That it
is meaningless is something we have proven in the labs over and over
again. I'm not saying the prefetch isn't valuable. Prefetch was not part
of the conversation. I said separating tables and indexes serves no
useful purpose.

The only value is in spreading I/O over as many spindles and heads as
possible and as evenly as possible. And once you have accomplished that
reading a page of index or a page of table is just that ... reading a
page. No RDBMS and no operating systems cares which is which ... the
overhead is the same. And if you stripe multiple physical disks into a
single logical disk ... the page can be anywhere.
--
Daniel Morgan


In DB2, putting tables and indexes in separate tablespaces is a prerequisite
for putting them on different drives and in different bufferpools. I thought
that was understood, but maybe not. So your statement that it serves no
useful purpose is incorrect.
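
A minimal sketch of what Mark is describing, with made-up container paths, sizes, and names (not from the thread): two DMS tablespaces on different drives, with the table's indexes directed into the second one via the INDEX IN clause (valid only for DMS tablespaces):

```sql
-- Hypothetical containers on separate physical drives; sizes in 4K pages
CREATE TABLESPACE ts_data MANAGED BY DATABASE
  USING (FILE '/disk1/db2/data.dbf' 51200);
CREATE TABLESPACE ts_index MANAGED BY DATABASE
  USING (FILE '/disk2/db2/index.dbf' 25600);

-- Rows go to ts_data, all of the table's indexes go to ts_index
CREATE TABLE coordinate (
  fid INTEGER NOT NULL,
  aid INTEGER NOT NULL,
  x REAL, y REAL, z REAL
) IN ts_data INDEX IN ts_index;
```

Whether this placement actually helps is exactly what the rest of the thread is arguing about; the DDL only shows that the separation is expressed at table-creation time.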
Nov 12 '05 #43
Mark A wrote:
In DB2, putting tables and indexes in separate tablespaces is a prerequisite
for putting them on different drives and in different bufferpools. I thought
that was understood, but maybe not. So your statement that it serves no
useful purpose is incorrect.


Since when haven't you been able to install DB2 on a RAID 0 drive array
(by which I mean RAID 0, RAID 0+1 and RAID 1+0)?

Assuming DB2 still works on RAID arrays, your statement is not valid. The
relationship between tablespaces and drives is logical, not physical.
--
Daniel Morgan
http://www.outreach.washington.edu/e...ad/oad_crs.asp
http://www.outreach.washington.edu/e...oa/aoa_crs.asp
da******@x.washington.edu
(replace 'x' with a 'u' to reply)

Nov 12 '05 #44
Mark's second point, however, remains valid in principle:
bufferpools can be associated with tablespaces. And hence separating
index and table allows one to favor index pages over normal data pages in
the bufferpool.
Now I'd agree that in the scenario described above this would
likely not matter. And it requires a deep understanding of your system to
optimize properly. (Just like a stick shift requires a deeper understanding
of how your car works :-)
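
For reference, the association Serge mentions looks roughly like this in DB2 (sizes and names are illustrative, and assume the index tablespace from earlier in the thread uses 4K pages): a dedicated bufferpool is created and the index tablespace is bound to it, so index pages only compete with each other for that memory:

```sql
-- Hypothetical size in pages; PAGESIZE must match the tablespace's page size
CREATE BUFFERPOOL bp_index SIZE 25000 PAGESIZE 4K;

-- Bind the index tablespace to its own bufferpool
ALTER TABLESPACE ts_index BUFFERPOOL bp_index;
```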

Anyway, I would like to propose an experiment, which I will do at a
different spot in this thread.

Cheers
Serge

--
Serge Rielau
DB2 SQL Compiler Development
IBM Toronto Lab

Nov 12 '05 #45
Bing,

Let's try the following experiment:
SELECT COUNT(*) FROM (SELECT
CASE WHEN x <> y THEN CAST(RAISE_ERROR('70000', 'dummy') AS REAL) END,
CASE WHEN x <> y THEN CAST(RAISE_ERROR('70000', 'dummy') AS REAL) END FROM
COORDINATE AS C, FRAME AS F WHERE F.TID=1 AND C.FID=F.ID AND F.ID<1000
AND F.ID>0 ) AS T ORDER BY C.FID,C.AID;
vs.
SELECT
CASE WHEN x <> y THEN CAST(RAISE_ERROR('70000', 'dummy') AS REAL) END,
CASE WHEN x <> y THEN CAST(RAISE_ERROR('70000', 'dummy') AS REAL) END FROM
COORDINATE AS C, FRAME AS F WHERE F.TID=1 AND C.FID=F.ID AND F.ID<1000
AND F.ID>0 ORDER BY C.FID,C.AID;

What would this experiment do?
First, the CASE with RAISE_ERROR() ensures DB2 doesn't simplify the
query (it can't drop the RAISE_ERROR()). Be conscious about placing the
WHERE in the inner select!
Now, using the COUNT(*) will cut out all the network traffic, so we'll
get a fair idea of how much work DB2 (the engine) is doing vs. how much
is spent in the client-server communication.
If the cost is in the engine, all those nice proposals about indexes,
bufferpools, etc. may be relevant. If not, they are pointless.
The one dial that hasn't been mentioned yet is the communication buffer.
You should already get BLOCKING (i.e. prefetching of the result set into
the comm buffer). Now we need to dial up the buffer itself.
I think it's the DB2COMM special register, likely a client side thing
(never trust the SQL compiler guy ;-)

Cheers
Serge
--
Serge Rielau
DB2 SQL Compiler Development
IBM Toronto Lab

Nov 12 '05 #46
"Mark A" <ma@switchboard.net> wrote in message news:<gk******************@news.uswest.net>...
Sequential prefetch for the table and/or indexes
would probably be used in an optimal access path. In such cases, table and
index placement is important, as is extent size, prefetch size, and
bufferpool page size.


I think Bing stated he's running on a single disk. He also stated that
DB2 is "slow" (10 minutes) when doing a simple select returning many
rows. So I still feel 20,000 rows/sec is all he's likely to get.
There's no point making DB2 run the query faster internally if the
coordinator/client interaction can't spool the results out fast
enough.

Perhaps Bing could try timing the following query, to test your theory
that i/o throughput is a problem and that fetching 20,000/sec can be
improved on:

select x, y, z from coordinate where fid < 1000
and (x + y) = (z * -1) - 1;

The purpose of the second, added line (non-indexable) is to reduce the
resultset. If the query runs much quicker than 10 minutes then it
indicates the extent to which fetch rate is the bottleneck.

Bing, if you read this, then could you also let us know how many rows your
10-minute query returned:

select x, y, z from coordinate where fid < 1000
DG
Nov 12 '05 #47
In article <bp**********@hanover.torolab.ibm.com>, sr*****@ca.eye-
bee-m.com says...
The one dial that hasn't been mentioned yet is the communication buffer.
You should already get BLOCKING (i.e prefetching of the resultset into
the comm buffer) Now we need to dial up the buffer itself.
I think it's the DB2COMM special register, likely a client side thing
(never trust the SQL compiler guy ;-)


You're right:)
It's the DBM CFG parm RQRIOBLK (max requester I/O block size).
By default it's 32767, maximum is 65535.

But it is used only when connecting from a remote client.
I don't know enough about the internal memory sizes for local
clients, but a larger RQRIOBLK combined with the OPTIMIZE FOR n ROWS
clause on a select improves performance for remote clients.
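
Putting those two knobs together might look like the sketch below (the RQRIOBLK value and the row count are illustrative; the dbm cfg change requires an instance restart, and the query is the thread's original join):

```sql
-- From the DB2 CLP, raise the requester I/O block size to its maximum:
--   db2 UPDATE DBM CFG USING RQRIOBLK 65535
--   db2stop
--   db2start

-- Then hint a deep fetch so row blocking fills the larger comm buffer:
SELECT c.x, c.y, c.z
FROM coordinate AS c, frame AS f
WHERE f.id < 1000 AND f.tid = 1 AND f.id = c.fid
OPTIMIZE FOR 10000 ROWS;
```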
Nov 12 '05 #48
The operating system being used is almost certainly Linux.

Daniel Morgan wrote:
... I don't know what
operating system is being used here, one can't even guarantee that a
single file laid down on a disk will be contiguous except within a raw
partition.


Nov 12 '05 #49
Blair Adamache wrote:
The operating system being used is almost certainly Linux.

Daniel Morgan wrote:
... I don't know what operating system is being used here, one can't
even guarantee that a single file laid down on a disk will be
contiguous except within a raw
partition.


Can't speak to a generic, much less any specific Linux version. But
some operating systems are designed to attempt optimization by
spreading a single file across a disk even when it is a single file
on a brand new disk. May not affect you but something to consider.
--
Daniel Morgan
http://www.outreach.washington.edu/e...ad/oad_crs.asp
http://www.outreach.washington.edu/e...oa/aoa_crs.asp
da******@x.washington.edu
(replace 'x' with a 'u' to reply)

Nov 12 '05 #50
