
DB2 vs MySQL - performance on large tables

Hi all,

I am running a database containing large datasets:

frames: 20 thousand rows,
coordinates: 170 million rows.

The database has been implemented with:

IBM DB2 v8.1
MySQL v3.23.54

using similar DDL and the same indexes. I then ran a SQL query joining the two tables:

select c.x,c.y,c.z from coordinate as c,frame as f
where fid<1000 and f.tid=1 and f.id=c.fid

The join has roughly 3,400 billion possible row combinations and returns about 8.5 million records. MySQL seems to outperform DB2 by a wide margin:

DB2: 7 minutes
MySQL: 3 minutes

MySQL is running with default settings, while DB2 has:

Buffer pool: 500 MB (1 GB system memory)
For other settings, see my previous message titled
"Improve query performance on large table"

Should I conclude that MySQL really does run much faster than DB2 on large tables? Could anyone advise? BTW, one thing I should also mention is the format of the returned data:

DB2: 1.279200e+01 -1.596900e+01 -3.979500e+01
MySQL: 12.792 -15.969 -39.795

(x,y,z are defined as REAL in both databases)

I doubt this could account for the performance difference.
Best regards,

Bing


Nov 12 '05
"> Mark A wrote:
> Since DB2 creates a unique index on (FID, AID) if one does not already
exist, then you don't need another index on FID. But you do need an index on AID since it is not the first column of the index for the primary key.
Another index on FID-only is redundant to the one created for the Primary Key. Removing the redundant index will speed up data loads, reorgs, etc.
Thanks for the comments. I do have other queries based on AIDs. So I'd

assume I should create separate AID index. Since reverse scan on (FID, AID)
won't make any sense for a query if it only searches based on either AID or
FID.
Bing

To effectively use the b-tree of the index (as opposed to scanning the
entire index) you must supply the first column(s) of the index in your
predicate. It is OK if you do not supply the trailing columns of the index.

So you need one unique index on (FID, AID) and one index on AID. Having
another index on FID-only is redundant, and not good for "overall"
performance.

To use the allow reverse scan feature of the index on (FID, AID), you will
need to create it as a unique index allowing reverse scans before you create
the primary key. This will allow DB2 to use this index and not create one of
its own as part of the primary key statement.
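For reference, a minimal sketch of that ordering against the COORDINATE table described later in the thread (the name IDX_COORDINATE_PK is an assumption, not from the thread; the other names come from the posted DDL):

CREATE UNIQUE INDEX "DB2ADMIN"."IDX_COORDINATE_PK"
    ON "DB2ADMIN"."COORDINATE" ("FID" ASC, "AID" ASC)
    ALLOW REVERSE SCANS;

-- the primary key now reuses the index above instead of creating its own
ALTER TABLE "DB2ADMIN"."COORDINATE"
    ADD CONSTRAINT "PK_COORDINATE8" PRIMARY KEY ("FID", "AID");

-- keep a single-column index for queries that filter on AID alone;
-- the FID-only index becomes redundant and can be dropped
CREATE INDEX "DB2ADMIN"."IDX_COORDINATE2"
    ON "DB2ADMIN"."COORDINATE" ("AID" ASC);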
Nov 12 '05 #31
Mark A wrote:
> So you need one unique index on (FID, AID) and one index on AID. Having
> another index on FID-only is redundant, and not good for "overall"
> performance. [...] To use the allow reverse scan feature of the index on
> (FID, AID), you will need to create it as a unique index allowing reverse
> scans before you create the primary key.

Thanks Mark. I will apply the changes and see whether they make any difference to performance :-).

Bing
Nov 12 '05 #32
> Mark A wrote:
> > So you need one unique index on (FID, AID) and one index on AID. [...]
> > To use the allow reverse scan feature of the index on (FID, AID), you
> > will need to create it as a unique index allowing reverse scans before
> > you create the primary key.
>
> Thanks Mark. Will apply the changes and see if it makes any difference to
> performance :-).
>
> Bing

It would probably help if you define a separate tablespace for the indexes
(defined in the create table statement).
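For what it's worth, a minimal sketch of that suggestion for the COORDINATE table, using the INDEX IN clause of CREATE TABLE and assuming DMS table spaces named DATA_TS and INDEX_TS already exist (the names are assumptions; a separate index tablespace can only be specified for tables in DMS table spaces, not SMS ones such as the default USERSPACE1):

CREATE TABLE "DB2ADMIN"."COORDINATE" (
    "FID" INTEGER NOT NULL ,
    "AID" INTEGER NOT NULL ,
    "X" REAL NOT NULL ,
    "Y" REAL NOT NULL ,
    "Z" REAL NOT NULL )
    IN "DATA_TS" INDEX IN "INDEX_TS" ;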
Nov 12 '05 #33
Leo
Since you are getting index-only access on both tables, how many disks do
you have? How many containers per tablespace? Are the tempspace and the
two tables' index spaces (or table spaces, if you're using SMS) on
separate physical disks? And how many NUM_IOSERVERS and NUM_IOCLEANERS
do you have defined?
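For reference, both settings live in the database configuration and can be checked and changed from the DB2 command line. A quick sketch - the database name BIODB and the values 8 and 4 are assumptions, not taken from the thread:

db2 get db cfg for BIODB
db2 update db cfg for BIODB using NUM_IOSERVERS 8 NUM_IOCLEANERS 4
db2 terminate

A common rule of thumb is roughly one I/O server per physical disk used for prefetching; the new values take effect after the database is deactivated and reactivated.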

Leo

db******@hotmail.com (Database Guy) wrote in message
news:<7f**********************@posting.google.com>...
Bing Wu <bi**@biop.ox.ac.uk> wrote in message news:<bo**********@news.ox.ac.uk>...
When I run the following query against the 'coordinate' table:

select x,y,z from coordinate where fid<1000

it takes more than 10 minutes to finish. I don't really understand why.


Sounds as though this is just illustrating my point that with DB2 (and
I think most DBMSs) exporting a large result set is often a bottleneck
on (I think) the coordinator agent. MySQL by all accounts is unusually
lightweight, and I suspect it has a very different architecture that may
be the ideal choice if your sole criterion is speed of dumping tables
to a flat file. I also suspect your query may even be serviced by a single
agent (thread) - maybe the reduction in inter-agent communication
overhead is a big advantage for queries that just dump tables?

Nobody's come back yet with suggestions on how to increase the fetch rate
beyond 20,000/sec, so I guess you can take that as tacit acceptance that
this is more or less optimal for DB2 right now. I'd love a technical
explanation of why MySQL might be so much quicker under these
(specific) circumstances - is this basically inevitable?
DG

Nov 12 '05 #34
Leo
It looks to me like the tables are created in the USERSPACE1
tablespace, which by default has a 4K page size. Then you have this:

"CREATE BUFFERPOOL "USERPOOL" SIZE 10820 PAGESIZE 32768 NOT EXTENDED
STORAGE;"

So you created a bufferpool with a 32K page size. You cannot put a
tablespace defined with a 4K page size into a 32K-page bufferpool. Also,
I don't see where you alter any tablespaces to use the bufferpool you
created. My guess is that your tempspace and USERSPACE1 are still using
the IBMDEFAULTBP bufferpool, which is most likely at the default size of
1000 4K pages.

Drop the USERPOOL bufferpool, alter the IBMDEFAULTBP to about 100000
pages, stop and start the database, and try it again. I'm still wondering
about the number of IOCLEANERS and IOSERVERS, but this looks like a
bufferpool sizing issue to me. Or am I missing some detail about your
configuration?
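A sketch of those two steps in SQL - the 100000-page figure is Leo's suggestion (roughly 390 MB of 4K pages); everything else is stock syntax:

DROP BUFFERPOOL "USERPOOL";
ALTER BUFFERPOOL "IBMDEFAULTBP" SIZE 100000;
-- then disconnect all applications, db2stop, db2start

Note that DROP BUFFERPOOL fails if any tablespace is still assigned to the pool; in that case reassign the tablespace first (ALTER TABLESPACE ... BUFFERPOOL "IBMDEFAULTBP"), keeping the page sizes matched.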

Leo

Bing Wu <bi**@biop.ox.ac.uk> wrote in message news:<bp**********@news.ox.ac.uk>...
Jan M. Nelken wrote:

Would you mind posting DDL for both tables? Inquiring minds want to know
and play with configuration a bit.

Jan M. Nelken


Full DDL for both tables is attached below for your reference. I'd appreciate any comments.

Thanks,

Bing

DB2:-

CREATE TABLE "DB2ADMIN"."COO RDINATE" (
"FID" INTEGER NOT NULL ,
"AID" INTEGER NOT NULL ,
"X" REAL NOT NULL ,
"Y" REAL NOT NULL ,
"Z" REAL NOT NULL )
IN "USERSPACE1 " ;
ALTER TABLE "DB2ADMIN"."COO RDINATE" APPEND ON;

CREATE INDEX "DB2ADMIN"."IDX _COORDINATE1" ON "DB2ADMIN"."COO RDINATE"
("FID" ASC)
PCTFREE 0 ;

CREATE INDEX "DB2ADMIN"."IDX _COORDINATE2" ON "DB2ADMIN"."COO RDINATE"
("AID" ASC)
PCTFREE 0 ;

ALTER TABLE "DB2ADMIN"."COO RDINATE"
ADD CONSTRAINT "PK_COORDINATE8 " PRIMARY KEY
("FID",
"AID");

CREATE BUFFERPOOL "USERPOOL" SIZE 10820 PAGESIZE 32768 NOT EXTENDED STORAGE;

ALTER TABLESPACE SYSCATSPACE
PREFETCHSIZE 32
OVERHEAD 24.100000
TRANSFERRATE 0.900000;
ALTER TABLESPACE TEMPSPACE1
PREFETCHSIZE 32
OVERHEAD 24.100000
TRANSFERRATE 0.900000;
ALTER TABLESPACE USERSPACE1
PREFETCHSIZE 32
OVERHEAD 24.100000
TRANSFERRATE 0.900000;

CREATE TABLE "DB2ADMIN"."FRA ME" (
"ID" INTEGER NOT NULL ,
"TID" INTEGER NOT NULL ,
"SNUM" INTEGER NOT NULL WITH DEFAULT 1 ,
"TIME" REAL NOT NULL WITH DEFAULT 0 ,
"TYPE" CHAR(4) NOT NULL ,
"VOLUME" DOUBLE NOT NULL ,
"PRESSURE" DOUBLE NOT NULL ,
"TEMP" DOUBLE NOT NULL ,
"TEMP_S" DOUBLE NOT NULL ,
"TEMP_1" DOUBLE ,
"TEMP_2" DOUBLE ,
"TEMP_3" DOUBLE ,
"TEMP_4" DOUBLE ,
"TEMP_5" DOUBLE ,
"ENERGY" DOUBLE NOT NULL ,
"ENERGY_S" DOUBLE NOT NULL ,
"ENERGY_1" DOUBLE ,
"ENERGY_2" DOUBLE ,
"ENERGY_3" DOUBLE ,
"ENERGY_4" DOUBLE ,
"ENERGY_5" DOUBLE ,
"POTENTIAL" DOUBLE NOT NULL ,
"POTENTIAL_ S" DOUBLE NOT NULL ,
"POTENTIAL_ 1" DOUBLE ,
"POTENTIAL_ 2" DOUBLE ,
"POTENTIAL_ 3" DOUBLE ,
"POTENTIAL_ 4" DOUBLE ,
"POTENTIAL_ 5" DOUBLE ,
"RMSD" DOUBLE WITH DEFAULT 0 ,
"RADIUS_OF_GYRA TION" DOUBLE WITH DEFAULT 0 )
IN "USERSPACE1 " ;
CREATE INDEX "DB2ADMIN"."IDX _FRAME1" ON "DB2ADMIN"."FRA ME"
("TID" ASC)
PCTFREE 10 ;

CREATE INDEX "DB2ADMIN"."IDX _FRAME2" ON "DB2ADMIN"."FRA ME"
("TYPE" ASC)
PCTFREE 10 ;

CREATE INDEX "DB2ADMIN"."IDX _FRAME3" ON "DB2ADMIN"."FRA ME"
("SNUM" ASC)
PCTFREE 10 ;

CREATE INDEX "DB2ADMIN"."IDX _FRAME4" ON "DB2ADMIN"."FRA ME"
("TIME" ASC)
PCTFREE 10 ;

ALTER TABLE "DB2ADMIN"."FRA ME"
ADD CONSTRAINT "PK_FRAME3" PRIMARY KEY
("ID");
MySQL:-

CREATE TABLE FRAME (
ID INTEGER NOT NULL ,
TID INTEGER NOT NULL ,
SNUM INTEGER NOT NULL DEFAULT 1 ,
TIME REAL NOT NULL DEFAULT 0,
TYPE CHAR(4) NOT NULL ,
VOLUME DOUBLE NOT NULL ,
PRESSURE DOUBLE NOT NULL ,
TEMP DOUBLE NOT NULL ,
TEMP_S DOUBLE NOT NULL ,
TEMP_1 DOUBLE ,
TEMP_2 DOUBLE ,
TEMP_3 DOUBLE ,
TEMP_4 DOUBLE ,
TEMP_5 DOUBLE ,
ENERGY DOUBLE NOT NULL ,
ENERGY_S DOUBLE NOT NULL ,
ENERGY_1 DOUBLE ,
ENERGY_2 DOUBLE ,
ENERGY_3 DOUBLE ,
ENERGY_4 DOUBLE ,
ENERGY_5 DOUBLE ,
POTENTIAL DOUBLE NOT NULL ,
POTENTIAL_S DOUBLE NOT NULL ,
POTENTIAL_1 DOUBLE ,
POTENTIAL_2 DOUBLE ,
POTENTIAL_3 DOUBLE ,
POTENTIAL_4 DOUBLE ,
POTENTIAL_5 DOUBLE ,
RMSD DOUBLE DEFAULT 0 ,
RADIUS_OF_GYRATION DOUBLE DEFAULT 0,
PRIMARY KEY ID (ID),
INDEX (TID)
);
CREATE TABLE COORDINATE (
FID INTEGER NOT NULL ,
AID INTEGER NOT NULL ,
X REAL NOT NULL ,
Y REAL NOT NULL ,
Z REAL NOT NULL ,
PRIMARY KEY (FID,AID),
INDEX (FID),
INDEX (AID)
);

Nov 12 '05 #35
Ken
Bing Wu <bi**@biop.ox.ac.uk> wrote in message news:<bp**********@news.ox.ac.uk>...
MySQL has been very popular and active in the e-commerce world. I have
implemented a few projects with MySQL over the past 4 years and had no
major complaints. It's fast and cheap to run, with native multi-thread
support, so it's ideal for SMP/MPP servers. It also has impressive
concurrency support, so multi-thousand-user access isn't a problem. However,
one of its disadvantages is support: since it's GPL-based, it is not as
well supported (funded) as purely commercial databases, i.e. DB2 or
Oracle. But it's growing very fast, especially in the SMB sector. And it's
free, so large ISPs are using MySQL to handle large transaction volumes.

Unfortunately the vendor, MySQL AB, has also repeatedly stated that
capabilities like views, sub-selects, unions, stored procedures, and
referential integrity are only needed by 1% of applications and are a
waste of time for everyone else, that RI should be implemented by the
application, that the only thing that counts is speed, etc. No
thanks - I remember what that was like back in the early 80s, before
robust relational database management systems became widely available.
And as far as large ISPs using it to handle transactions is
concerned - I have only seen large ISPs offering it because of name
recognition. MySQL doesn't really offer transactions - it relies on
a completely separate product (InnoDB) to provide them.
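(For context, in the MySQL 3.23 line used in this thread you had to ask for a transactional table type explicitly, table by table; a minimal sketch with a made-up table, purely for illustration:

CREATE TABLE ACCOUNT (
    ID INTEGER NOT NULL,
    BALANCE DOUBLE NOT NULL,
    PRIMARY KEY (ID)
) TYPE=InnoDB;

Tables created without a TYPE clause, such as the FRAME and COORDINATE tables posted earlier, default to MyISAM, which has no transaction support.)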

So, for that reason, my recommendation to my colleagues is to use
PostgreSQL when they want something free. PostgreSQL has sufficient
capabilities that if they later want to port the code to a genuine
relational database, the migration is fairly painless: they don't
end up pushing a ton of complexity into the application, but can let
the database handle it instead.

This view was recently confirmed when I migrated a large SQL Server
solution to PostgreSQL. Although I hit a few snags, in general there
was a 1:1 mapping between data-model components in SQL Server and
PostgreSQL. That would not at all have been the case with MySQL.

ken
Nov 12 '05 #36
"Mark A" <ma@switchboard .net> wrote in message news:<3m******* ********@news.u swest.net>...
It would probably help if you define a separate tablespace for the indexes
(defined in the create table statement).


Why? Will any of these suggestions help increase the fetch rate beyond
20,000/sec?

Personally I've never seen much higher than that, so if there are
stacks of people out there who've used DB2 and got much higher rates I
really want to know how.
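As an aside, one thing that sometimes helps bulk fetches in DB2 is declaring the query read-only so the client can use row blocking; a hedged sketch against the join from the first post - whether it actually pushes past 20,000 rows/sec here is untested:

SELECT C.X, C.Y, C.Z
  FROM COORDINATE AS C, FRAME AS F
 WHERE C.FID < 1000
   AND F.TID = 1
   AND F.ID = C.FID
   FOR READ ONLY;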
DG
Nov 12 '05 #37
Database Guy wrote:
> "Mark A" <ma@switchboard.net> wrote in message news:<3m***************@news.uswest.net>...
>
> > It would probably help if you define a separate tablespace for the indexes
> > (defined in the create table statement).
>
> Why? Will any of these suggestions help increase the fetch rate beyond
> 20,000/sec?
>
> Personally I've never seen much higher than that, so if there are
> stacks of people out there who've used DB2 and got much higher rates I
> really want to know how.
>
> DG


Plus separating tables and indexes has never done anything for anybody
in any database to improve performance. At least not in a multi-user
multi-transaction environment with various flavours of RAID.
--
Daniel Morgan
http://www.outreach.washington.edu/e...ad/oad_crs.asp
http://www.outreach.washington.edu/e...oa/aoa_crs.asp
da******@x.washington.edu
(replace 'x' with a 'u' to reply)

Nov 12 '05 #38
> > "Mark A" <ma@switchboard .net> wrote in message
It would probably help if you define a separate tablespace for the indexes(defined in the create table statement).


Plus separating tables and indexes has never done anything for anybody
in any database to improve performance. At least not in a multi-user
multi-transaction environment with various flavours of RAID.
--
Daniel Morgan


If you followed this discussion from the beginning, it was stated that the
person was scanning and retrieving a very large number of rows in a table
with over 4 million rows. Sequential prefetch for the table and/or indexes
would probably be used in an optimal access path. In such cases, table and
index placement is important, as is extent size, prefetch size, and
bufferpool page size. The optimum configuration is somewhat different for
most flavors of RAID, and needs to be done carefully if performance is
paramount (the guy is complaining about performance).

I would agree that in a multi-user transaction system (definitely not the
system described here), dataset placement is not critical, especially with
decent sized buffer pools.
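To make the placement point concrete, here is a rough sketch of the kind of layout being described - the container paths, sizes, and tablespace name are assumptions, not taken from the thread; the idea is one container per physical disk and a prefetch size equal to the extent size times the number of containers:

CREATE TABLESPACE "DATA_TS"
    PAGESIZE 4 K
    MANAGED BY DATABASE
    USING (FILE '/disk1/coord01.dbf' 300000,
           FILE '/disk2/coord02.dbf' 300000)
    EXTENTSIZE 32
    PREFETCHSIZE 64
    BUFFERPOOL "IBMDEFAULTBP";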
Nov 12 '05 #39
Mark A wrote:
> If you followed this discussion from the beginning, it was stated that the
> person was scanning and retrieving a very large number of rows. Sequential
> prefetch for the table and/or indexes would probably be used in an optimal
> access path. In such cases, table and index placement is important, as is
> extent size, prefetch size, and bufferpool page size. The optimum
> configuration is somewhat different for most flavors of RAID, and needs to
> be done carefully if performance is paramount.
>
> I would agree that in a multi-user transaction system (definitely not the
> system described here), dataset placement is not critical, especially with
> decent sized buffer pools.


I would suggest that it isn't important in the described system either.
To assume that "sequential" means a disk head is doing one and only one
thing until it is done is nice theory, but not real-world. With many
operating systems - and I don't know which operating system is being used
here - one can't even guarantee that a single file laid down on a disk
will be contiguous, except within a raw partition.
--
Daniel Morgan
http://www.outreach.washington.edu/e...ad/oad_crs.asp
http://www.outreach.washington.edu/e...oa/aoa_crs.asp
da******@x.washington.edu
(replace 'x' with a 'u' to reply)

Nov 12 '05 #40
