consider the following hypothetical (it isn't but....)
- start with a COBOL/VSAM codebase that's REALLY old
- end with both a COBOL/VSAM and a COBOL/DB2 system which differs in only
where the data lives. code, including copybooks, remains the same.
mostly the system does periodic (daily/weekly/<etc>) batch runs.
the VSAM order file has 99 line items. this gets converted to an
Order table with 99 line_items.
beyond the need to fit the copybook definition, it is asserted that a
two table (Order and Order_line) implementation will be too slow.
this rumination was motivated by reading an article linked to from an
earlier thread, which discussed join implementation on 390. i suspect what
it had to say applies generally. what caught me was, as i understood it,
that nested table reads is most often used. if this is true, and it seems
that hash joins are only more efficient on equality constraints, then is
there a known analysis which at least mitigates the reading?
what (hypothetically ) we tend to do is put each table in a tablespace.
for the Order/Order_line implementation, it seems logical to put them into
one tablespace, cluster Order on Order_num and Order_line on Order_num,
line_num; and buffer the hell out of it. similarly for indexes.
does this sound remotely on the right track? 3 3073 gn*****@rcn.com (robert) wrote in message news:<da******* *************** ****@posting.go ogle.com>... consider the following hypothetical (it isn't but....)
- start with a COBOL/VSAM codebase that's REALLY old - end with both a COBOL/VSAM and a COBOL/DB2 system which differs in only where the data lives. code, including copybooks, remains the same.
mostly the system does periodic (daily/weekly/<etc>) batch runs.
the VSAM order file has 99 line items. this gets converted to an Order table with 99 line_items.
beyond the need to fit the copybook definition, it is asserted that a two table (Order and Order_line) implementation will be too slow.
this rumination was motivated by reading an article linked to from an earlier thread, which discussed join implementation on 390. i suspect what it had to say applies generally. what caught me was, as i understood it, that nested table reads is most often used. if this is true, and it seems that hash joins are only more efficient on equality constraints, then is there a known analysis which at least mitigates the reading?
what (hypothetically ) we tend to do is put each table in a tablespace. for the Order/Order_line implementation, it seems logical to put them into one tablespace, cluster Order on Order_num and Order_line on Order_num, line_num; and buffer the hell out of it. similarly for indexes.
does this sound remotely on the right track?
Assuming that you never have an SQL statement result in a tablespace
scan, it might be OK to put the Order/Order_line in one simple
tablespace. But if a tablespace scan does occur, DB2 will
(unnecessarily) scan both tables, when it could have just scanned the
table needed if they were in separate tablespaces.
If you use a segmented tablespace, this will not help you since the
data for two tables in a single segmented tablespace will not be on
the same page (the lines items and the associated order), they will
reside in different segments.
I think you are over designing just a bit. By using DB2 buffer pools
effectively (this is the biggest difference between how DB2 and plain
VSAM works), I don't think you need to put the two tables in the same
tablespace.
Have one bufferpool for the catalog, indexes, and small tables that
frequently accessed. Second bufferpool for medium and large tables.
Third bufferpool for large decision support tables (if you have any in
your application).
The speed of a join has nothing to do with whether the tables are
using the same tablespace. The speed is related to whether the
required data page is already in the bufferpool, or if it needs to be
fetched from disk. m0****@yahoo.co m (Mark) wrote in message news:<a5******* *************** ****@posting.go ogle.com>... gn*****@rcn.com (robert) wrote in message news:<da******* *************** ****@posting.go ogle.com>... consider the following hypothetical (it isn't but....)
- start with a COBOL/VSAM codebase that's REALLY old - end with both a COBOL/VSAM and a COBOL/DB2 system which differs in only where the data lives. code, including copybooks, remains the same.
mostly the system does periodic (daily/weekly/<etc>) batch runs.
the VSAM order file has 99 line items. this gets converted to an Order table with 99 line_items.
beyond the need to fit the copybook definition, it is asserted that a two table (Order and Order_line) implementation will be too slow.
this rumination was motivated by reading an article linked to from an earlier thread, which discussed join implementation on 390. i suspect what it had to say applies generally. what caught me was, as i understood it, that nested table reads is most often used. if this is true, and it seems that hash joins are only more efficient on equality constraints, then is there a known analysis which at least mitigates the reading?
what (hypothetically ) we tend to do is put each table in a tablespace. for the Order/Order_line implementation, it seems logical to put them into one tablespace, cluster Order on Order_num and Order_line on Order_num, line_num; and buffer the hell out of it. similarly for indexes.
does this sound remotely on the right track?
Assuming that you never have an SQL statement result in a tablespace scan, it might be OK to put the Order/Order_line in one simple tablespace. But if a tablespace scan does occur, DB2 will (unnecessarily) scan both tables, when it could have just scanned the table needed if they were in separate tablespaces.
If you use a segmented tablespace, this will not help you since the data for two tables in a single segmented tablespace will not be on the same page (the lines items and the associated order), they will reside in different segments.
I think you are over designing just a bit. By using DB2 buffer pools effectively (this is the biggest difference between how DB2 and plain VSAM works), I don't think you need to put the two tables in the same tablespace.
Have one bufferpool for the catalog, indexes, and small tables that frequently accessed. Second bufferpool for medium and large tables. Third bufferpool for large decision support tables (if you have any in your application).
The speed of a join has nothing to do with whether the tables are using the same tablespace. The speed is related to whether the required data page is already in the bufferpool, or if it needs to be fetched from disk.
my working assumption was that, given the amount of data, that buffers
would exhaust during this (essentially) sequential batch update process,
and that emulating, to the extent possible, contiguous data storage would
help. may be not. but i do agree, that bufferpools of adequate size
are more important than most anything.
In article <da************ **************@ posting.google. com>,
robert <gn*****@rcn.co m> wrote: m0****@yahoo.co m (Mark) wrote in message news:<a5******* *************** ****@posting.go ogle.com>... gn*****@rcn.com (robert) wrote in message news:<da******* *************** ****@posting.go ogle.com>... consider the following hypothetical (it isn't but....)
- start with a COBOL/VSAM codebase that's REALLY old - end with both a COBOL/VSAM and a COBOL/DB2 system which differs in only where the data lives. code, including copybooks, remains the same.
mostly the system does periodic (daily/weekly/<etc>) batch runs.
the VSAM order file has 99 line items. this gets converted to an Order table with 99 line_items.
beyond the need to fit the copybook definition, it is asserted that a two table (Order and Order_line) implementation will be too slow.
this rumination was motivated by reading an article linked to from an earlier thread, which discussed join implementation on 390. i suspect what it had to say applies generally. what caught me was, as i understood it, that nested table reads is most often used. if this is true, and it seems that hash joins are only more efficient on equality constraints, then is there a known analysis which at least mitigates the reading?
what (hypothetically ) we tend to do is put each table in a tablespace. for the Order/Order_line implementation, it seems logical to put them into one tablespace, cluster Order on Order_num and Order_line on Order_num, line_num; and buffer the hell out of it. similarly for indexes.
does this sound remotely on the right track? Assuming that you never have an SQL statement result in a tablespace scan, it might be OK to put the Order/Order_line in one simple tablespace. But if a tablespace scan does occur, DB2 will (unnecessarily) scan both tables, when it could have just scanned the table needed if they were in separate tablespaces.
If you use a segmented tablespace, this will not help you since the data for two tables in a single segmented tablespace will not be on the same page (the lines items and the associated order), they will reside in different segments.
I think you are over designing just a bit. By using DB2 buffer pools effectively (this is the biggest difference between how DB2 and plain VSAM works), I don't think you need to put the two tables in the same tablespace.
Have one bufferpool for the catalog, indexes, and small tables that frequently accessed. Second bufferpool for medium and large tables. Third bufferpool for large decision support tables (if you have any in your application).
The speed of a join has nothing to do with whether the tables are using the same tablespace. The speed is related to whether the required data page is already in the bufferpool, or if it needs to be fetched from disk.
my working assumption was that, given the amount of data, that buffers would exhaust during this (essentially) sequential batch update process, and that emulating, to the extent possible, contiguous data storage would help. may be not. but i do agree, that bufferpools of adequate size are more important than most anything.
IF you are saying that the batch process will read each of two tables
sequentially, in the order that they will be physically held within DB2
(bearing in mind clustering sequence, free space, insert pattern, rows
out of sequence, SQL code, runstats, program bind, etc)
THEN I would expect that DB2 will automatically invoke sequential
pre-fetch. This means that DB2 will try to read the next pages into the
bufffer pool asynchronously BEFORE the program gets there, and updated
pages will be written out to physical disc, again asynchronous with the
program processing.
This effectively (ie simplified) means there are 3 separate CPU tasks -
one read, one update, one write, and the update task is only reading and
writing data in the buffer pools. This can be very efficient.
.... but if the access is random, then all performance bets are off, and
you have to be much more careful!
Hence why it is critical to have an understanding how data will be used
before the database is designed, at least for high-performance systems.
Martin
--
Martin Avison
Note that emails to News@ will be junked. Use Martin instead of News This thread has been closed and replies have been disabled. Please start a new discussion. Similar topics |
by: Brian Patterson |
last post by:
I have noticed in the book of words that hasattr works by calling getattr
and raising an exception if no such attribute exists. If I need the value
in any case, am I better off using getattr within a try statement myself, or
is there some clever implementation enhancement which makes this a bad idea?
i.e. should I prefer:
if...
|
by: serge |
last post by:
I have an SP that is big, huge, 700-800 lines.
I am not an expert but I need to figure out every possible way that
I can improve the performance speed of this SP.
In the next couple of weeks I will work on preparing SQL statements
that will create the tables, insert sample record and run the SP.
I would hope people will look at my SP and...
|
by: teedilo |
last post by:
We have an application with a SQL Server 2000 back end that is fairly
database intensive -- lots of fairly frequent queries, inserts, updates
-- the gamut. The application does not make use of performance hogs
like cursors, but I know there are lots of ways the application could
be made more efficient database-wise. The server code is...
|
by: robert |
last post by:
typed this into the ibm.com search window, but didn't get anything
that looked like it would answer a question:
+trigger +faster +db2 +cobol
the question: are DB2 (390/v6, at the moment) triggers better
or worse (and how much so, of course) than the same functionality
coded in COBOL? assuming, of course, equal competence in the code.
|
by: simon |
last post by:
Hi
Hopefully someone could point me in the right direction on this one.
INFRASTRUCTURE
DB2 v7 on MVS OS390
SCENARIO
We are currently loading large volumes of data (eg 20MM rows) per day
into a partitioned DB2 database. In the installed version of DB2
there are only upto 240 (or so) partitions, so we have to double up
| |
by: bjarne |
last post by:
Willy Denoyette wrote;
> ... it
> was not the intention of StrousTrup to the achieve the level of efficiency
> of C when he invented C++, ...
Ahmmm. It was my aim to match the performance of C and I achieved that
aim very early on. See, for example "The Design and Evolution of C++".
-- Bjarne Stroustrup;...
|
by: Gee |
last post by:
To migrate the VSAM files to SQL server 2000 I need to load 'OLE DB
provider for AS/400 and VSAM'. To load the provider I've to install
Host Integration Server. But the cost is too high.
Please anyone let me know, how to migrate the VSAM files to SQL server
2000 without HIS? or is there any freeware available to migrate the
VSAM files?
...
|
by: jvn |
last post by:
I am experiencing a particular problem with performance counters.
I have created a set of classes, that uses
System.Diagnostics.PerformanceCounter to increment custom performance
counters (using .Net 2.0)
The performance counter categories have been successfully created.
When the set of classes are used by a WinForm test harness application,...
|
by: sajithamol |
last post by:
The requirement of a program is to read a flat file containing 1,000,000 Transaction ID sequentially and hit a KSDS Transaction VSAM to get the transaction details for each transaction ID. The KEY of VSAM is Transaction ID.How to proceed this?
|
by: marktang |
last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However, people are often confused as to whether an ONU can Work As a Router. In this blog post, we’ll explore What is ONU, What Is Router, ONU & Router’s main...
|
by: Hystou |
last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can effortlessly switch the default language on Windows 10 without reinstalling. I'll walk you through it.
First, let's disable language...
| |
by: jinu1996 |
last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven tapestry of website design and digital marketing. It's not merely about having a website; it's about crafting an immersive digital experience that...
|
by: Hystou |
last post by:
Overview:
Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows Update option using the Control Panel or Settings app; it automatically checks for updates and installs any it finds, whether you like it or not. For...
|
by: tracyyun |
last post by:
Dear forum friends,
With the development of smart home technology, a variety of wireless communication protocols have appeared on the market, such as Zigbee, Z-Wave, Wi-Fi, Bluetooth, etc. Each protocol has its own unique characteristics and advantages, but as a user who is planning to build a smart home system, I am a bit confused by the...
|
by: TSSRALBI |
last post by:
Hello
I'm a network technician in training and I need your help.
I am currently learning how to create and manage the different types of VPNs and I have a question about LAN-to-LAN VPNs.
The last exercise I practiced was to create a LAN-to-LAN VPN between two Pfsense firewalls, by using IPSEC protocols.
I succeeded, with both firewalls in...
|
by: adsilva |
last post by:
A Windows Forms form does not have the event Unload, like VB6. What one acts like?
|
by: muto222 |
last post by:
How can i add a mobile payment intergratation into php mysql website.
| |
by: bsmnconsultancy |
last post by:
In today's digital era, a well-designed website is crucial for businesses looking to succeed. Whether you're a small business owner or a large corporation in Toronto, having a strong online presence can significantly impact your brand's success. BSMN Consultancy, a leader in Website Development in Toronto offers valuable insights into creating...
| |