High Performance DPF INSERTs

wombat53

Hi Group,
Are there any DB2 UDB ESE DPF V8.2 users exploiting "buffered inserts"
(BIND parm INSERT BUF) *and* "multi-row INSERTs" (many rows in the
VALUES clause of a single INSERT, to minimize the number of calls to
the RDBMS engine) in an SQL PL Stored Procedure?
I ask because the latter would imply embedded DYNAMIC SQL: the number
of rows to be inserted can vary (assume it does), so the SQL INSERT
would have to be built dynamically. It's not clear that DYNAMIC SQL and
Buffered Inserts are compatible, i.e. that one can reap the benefits of
both of these high-performance features. It's quite possible they are
incompatible, and I would have to abandon one or the other (multi-row
probably...).
Thanks
wombat53

Jul 12 '06 #1


Serge Rielau

Buffered insert works only if the inserts follow each other immediately
_without_any_SQL_in_between_.
In a loop in an SQL Procedure it's rather unlikely that this is the
case. The loop control will likely cause SQL execution unless it's trivial.

If you want to speed up mass inserts in a DPF system, keep in mind that
you are bottlenecking on the SQL Procedure logic itself (running on the
coordinator).
You may get big benefits from parallelizing the procedure. That is, CALL
the proc on each data node (or at least on each available coordinator if
you have multiple) and have it process a subset of the data (preferably
local data).
For a computationally heavy batch process reading from a staging table,
I have seen linear scalability with 2 concurrent procedure calls per
data node.
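
As an illustration of the fan-out described above (not code from the original post), here is a minimal Python sketch. It assumes four data nodes numbered 0 to 3, and `call_proc` is a hypothetical stand-in for connecting to a node and CALLing the procedure on its slice; no DB2 API is used.

```python
from concurrent.futures import ThreadPoolExecutor

DATA_NODES = [0, 1, 2, 3]   # hypothetical partition numbers
CALLS_PER_NODE = 2          # concurrency level reported above to scale linearly


def split_work(row_ids, n_slices):
    """Deal row ids round-robin into n_slices disjoint subsets."""
    return [row_ids[i::n_slices] for i in range(n_slices)]


def call_proc(node, row_ids):
    """Stand-in for connecting to `node` and CALLing the procedure.

    A real version would open a connection whose coordinator is `node`
    and pass the slice to the procedure; here it only reports the
    slice size.
    """
    return node, len(row_ids)


def run_parallel(row_ids):
    slices = split_work(row_ids, len(DATA_NODES) * CALLS_PER_NODE)
    # Slice i goes to node i % len(DATA_NODES), so every node receives
    # CALLS_PER_NODE concurrent calls.
    nodes = [DATA_NODES[i % len(DATA_NODES)] for i in range(len(slices))]
    with ThreadPoolExecutor(max_workers=len(slices)) as pool:
        return list(pool.map(call_proc, nodes, slices))
```

The round-robin split ignores data locality; to follow the "preferably local data" advice, each slice would instead be chosen by the partition that owns its rows.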
Cheers
Serge

--
Serge Rielau
DB2 Solutions Development
IBM Toronto Lab

IOD Conference
http://www.ibm.com/software/data/ond...ness/conf2006/

Jul 13 '06 #2

wombat53

Thanks Serge.
We have decided to bypass the problems of commingling DYNAMIC SQL and
Buffered Inserts, and replace the SQL with STATIC. We will use a CASE
expression to "jump" to the appropriate statically bound INSERT for
however many rows we have available to us from our Message Queue (that
will be a parm, and taken care of within the CASE expr. within the
STATIC INSERT). We are also mindful of the Special Considerations for
using Buffered Inserts, both as you have noted above and as documented
in ch. 31 of V8.2 "Prog. Client Applications".
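
The dispatch idea can be sketched outside SQL PL. The Python below is only an analogy, not the procedure itself: the table `MSG_LOG`, its columns, and the limit `MAX_ROWS` are hypothetical, and parameter markers stand in for the host variables a static INSERT would use. One statement is prepared per possible row count, and the row count selects among them.

```python
MAX_ROWS = 4  # hypothetical upper bound on rows taken from the queue per call


def build_insert(n_rows, table="MSG_LOG", columns=("ID", "PAYLOAD")):
    """Return INSERT text with n_rows parameter-marker rows in VALUES."""
    row = "(" + ", ".join("?" for _ in columns) + ")"
    return "INSERT INTO {} ({}) VALUES {}".format(
        table, ", ".join(columns), ", ".join([row] * n_rows))


# Built once up front: the analogue of binding one static INSERT per
# possible row count.
STATEMENTS = {n: build_insert(n) for n in range(1, MAX_ROWS + 1)}


def pick_statement(rows):
    """Select the pre-built statement matching the number of rows on hand."""
    if not 1 <= len(rows) <= MAX_ROWS:
        raise ValueError("expected 1..%d rows, got %d" % (MAX_ROWS, len(rows)))
    return STATEMENTS[len(rows)]
```

For example, `pick_statement([(1, "a"), (2, "b")])` returns the two-row statement. The number of pre-built statements grows with `MAX_ROWS`, which is the cost of avoiding dynamic SQL this way.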
I am following up on your comment: "That is CALL
the proc on each data-node (or at least each available coordinator if
you have multiple) and have it process a subset of the data (preferably
local data)."

Is this something along the lines of intelligently identifying the
optimal co-ordinator node for the SQL (INSERT) through the two APIs
sqlugrpn - Get Row Partitioning Number, and
sqlugtpi - Get Table Partitioning Information (called only once), such
that co-ordinator node and data node are one, with co-ordinators
distributed across servers, thereby also minimizing data
movement? Or are you getting at something different when you say
"parallelizing the procedure"? Perhaps sorting the messaged input in some
way, multi-threading the INSERTs, or simply having multiple
co-ordinator nodes, along the lines of the BCU/BPU methodology, etc.?
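
To make the first option concrete, here is a rough Python sketch of grouping incoming rows by owning partition before choosing a co-ordinator. Everything in it is an assumption for illustration: `partition_of` is a stand-in for sqlugrpn and does NOT reproduce DB2's hashing, and `PARTITION_MAP` (4096 entries over four partitions) stands in for the map that sqlugtpi would return.

```python
import zlib
from collections import defaultdict

# Hypothetical partitioning map: entry i holds the partition number that
# owns hash bucket i. The real map would be fetched once via sqlugtpi.
PARTITION_MAP = [i % 4 for i in range(4096)]


def partition_of(key):
    """Stand-in for sqlugrpn: a stable hash, not DB2's algorithm."""
    bucket = zlib.crc32(str(key).encode("utf-8")) % len(PARTITION_MAP)
    return PARTITION_MAP[bucket]


def group_by_partition(rows, key_index=0):
    """Bucket rows by owning partition so each batch can be sent to a
    connection whose co-ordinator is that partition."""
    groups = defaultdict(list)
    for row in rows:
        groups[partition_of(row[key_index])].append(row)
    return dict(groups)
```

Each resulting group could then be inserted through a connection to its own partition, so that co-ordinator and data node coincide and rows are not shipped between nodes.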
We expect to be running ESE (DPF)/Linux V9, most recent Beta, or the GA of
July 28. We are looking at many millions of SQL INSERTs per day, a
potential choke-point.
Thanks
wombat53

Jul 19 '06 #3
