sign in | join about | help | sitemap
Connecting Tech Pros Worldwide
Henry J.'s Avatar

question: multiple row insert vs. merge


Question posted by: Henry J. (Guest) on September 2nd, 2008 08:45 PM
My app needs to insert thousand value rows into a mostly empty table
(data are read from a file). I can either use inserts, or use merge.
The advantage of using merge is that in the few cases where the table
is not empty, it can take care of the updating part, which makes the
app cleaner.

However, my concern is the merge state would slow dowm the insertion
of new data, since in most cases the table is empty.

So my questions (before I run any comprehensive tests) are:

1. would merge slower than insert if it is an empty table?
2. given that we can do multiple row value inserts, would insert be
even faster than merge with which there is no multipe row option?

Thanks a lot!
24 Answers Posted
Serge Rielau's Avatar
Guest - n/a Posts
#2: Re: question: multiple row insert vs. merge

Henry J. wrote:
Quote:
Originally Posted by
My app needs to insert thousand value rows into a mostly empty table
(data are read from a file). I can either use inserts, or use merge.
The advantage of using merge is that in the few cases where the table
is not empty, it can take care of the updating part, which makes the
app cleaner.
>
However, my concern is the merge state would slow dowm the insertion
of new data, since in most cases the table is empty.
>
So my questions (before I run any comprehensive tests) are:
>
1. would merge slower than insert if it is an empty table?
2. given that we can do multiple row value inserts, would insert be
even faster than merge with which there is no multipe row option?

Is this DB2 for zOS? Which version?
Keep in mind that in DB2 for zOS supports NOT ATOMIC "array merge".
And of course DB2 for LUW support just regular MERGE with a VALUES
clause than can have multiple rows (or be based on a query)

Cheers
Serge

--
Serge Rielau
DB2 Solutions Development
IBM Toronto Lab
Henry J.'s Avatar
Guest - n/a Posts
#3: Re: question: multiple row insert vs. merge

On Sep 2, 4:59*pm, Serge Rielau <srie...@ca.ibm.comwrote:
Quote:
Originally Posted by
Is this DB2 for zOS? Which version?
Keep in mind that in DB2 for zOS supports NOT ATOMIC "array merge".
And of course DB2 for LUW support just regular MERGE with a VALUES
clause than can have multiple rows (or be based on a query)


The DB2 version is DB2/LINUXX8664 9.1.4.
Serge Rielau's Avatar
Guest - n/a Posts
#4: Re: question: multiple row insert vs. merge

Henry J. wrote:
Quote:
Originally Posted by
On Sep 2, 4:59 pm, Serge Rielau <srie...@ca.ibm.comwrote:
>
Quote:
Originally Posted by
>Is this DB2 for zOS? Which version?
>Keep in mind that in DB2 for zOS supports NOT ATOMIC "array merge".
>And of course DB2 for LUW support just regular MERGE with a VALUES
>clause than can have multiple rows (or be based on a query)

>
The DB2 version is DB2/LINUXX8664 9.1.4.

OK, then I don't understand your comment on multi row MERGE.
Also if the target table is empty The MERGE will be quick to decide that
no row matches. So I'm not worried about performance.

When in doubt compare the plans (and test).

Cheers
Serge

--
Serge Rielau
DB2 Solutions Development
IBM Toronto Lab
Henry J.'s Avatar
Guest - n/a Posts
#5: Re: question: multiple row insert vs. merge

On Sep 2, 6:28*pm, Serge Rielau <srie...@ca.ibm.comwrote:
Quote:
Originally Posted by
Henry J. wrote:
Quote:
Originally Posted by
On Sep 2, 4:59 pm, Serge Rielau <srie...@ca.ibm.comwrote:

>
Quote:
Originally Posted by
Quote:
Originally Posted by
Is this DB2 for zOS? Which version?
Keep in mind that in DB2 for zOS supports NOT ATOMIC "array merge".
And of course DB2 for LUW support just regular MERGE with a VALUES
clause than can have multiple rows (or be based on a query)

>
Quote:
Originally Posted by
The DB2 version is DB2/LINUXX8664 9.1.4.

>
OK, then I don't understand your comment on multi row MERGE.
Also if the target table is empty The MERGE will be quick to decide that
no row matches. So I'm not worried about performance.
>
When in doubt compare the plans (and test).
>
Cheers
Serge
>
--
Serge Rielau
DB2 Solutions Development
IBM Toronto Lab


Thanks Serge. Here are two notes:

1). The app is written in C++ using embeded SQL with parameter
binding. Depending on how "insert into table values(?,?,?)" is
implemented, I guess it is possible for C++ to figure out it should
use "insert into table values(?,?,?) values (?,?,?) ....
values(?,?,?)" which could be faster?

2) Even though the table is empty at first, since the app does the
value inserts in batches, the table will gradually be filled up and it
would take more and more time to find there is no match before the
insertion can be actually done?

Thanks!
Serge Rielau's Avatar
Guest - n/a Posts
#6: Re: question: multiple row insert vs. merge

Henry J. wrote:
Quote:
Originally Posted by
On Sep 2, 6:28 pm, Serge Rielau <srie...@ca.ibm.comwrote:
Quote:
Originally Posted by
>Henry J. wrote:
Quote:
Originally Posted by
>>On Sep 2, 4:59 pm, Serge Rielau <srie...@ca.ibm.comwrote:
>>>Is this DB2 for zOS? Which version?
>>>Keep in mind that in DB2 for zOS supports NOT ATOMIC "array merge".
>>>And of course DB2 for LUW support just regular MERGE with a VALUES
>>>clause than can have multiple rows (or be based on a query)
>>The DB2 version is DB2/LINUXX8664 9.1.4.

>OK, then I don't understand your comment on multi row MERGE.
>Also if the target table is empty The MERGE will be quick to decide that
>no row matches. So I'm not worried about performance.
>>
>When in doubt compare the plans (and test).
>>
>Cheers
>Serge
>>
>--
>Serge Rielau
>DB2 Solutions Development
>IBM Toronto Lab

>
Thanks Serge. Here are two notes:
>
1). The app is written in C++ using embeded SQL with parameter
binding. Depending on how "insert into table values(?,?,?)" is
implemented, I guess it is possible for C++ to figure out it should
use "insert into table values(?,?,?) values (?,?,?) ....
values(?,?,?)" which could be faster?

VALUES (?, ?, ?), (?, ?, ?), .... is the correct syntax
Yes that would be faster than individual inserts.
Quote:
Originally Posted by
2) Even though the table is empty at first, since the app does the
value inserts in batches, the table will gradually be filled up and it
would take more and more time to find there is no match before the
insertion can be actually done?

I do assume that you have an index....


--
Serge Rielau
DB2 Solutions Development
IBM Toronto Lab
w.l.fischer@googlemail.com's Avatar
w.l.fischer@googlemail.com September 4th, 2008 12:15 PM
Guest - n/a Posts
#7: Re: question: multiple row insert vs. merge

VALUES (?, ?, ?), (?, ?, ?), .... is the correct syntax

Imagine I have to insert n rows, and n is a different number every
time.

What is the recommended method from an embedded program/ODBC/JDBC (I
guess the same applies to all three):

- insert into (c1,c2) values (?, ?) (n statements with one row each)
- insert into (c1,c2) values (?, ?), (?, ?), (?, ?), ... (1 statement
with n rows)
- insert into (c1, c2) values (?, ?), (?, ?), ... (split the n rows
into n/100 inserts, and do the remainder in single row inserts)?

I think this answer would be an ideal pick for an FAQ as it seems to
be asked again and again... Is there such an FAQ for c.d.ibm-db2?

Did someone try it out performance-wise? My guess is the last one
would be the fastest if such inserts occur very often (no recompile of
the statement), and the second one might be faster if the insert occur
only once in a while.

Serge Rielau's Avatar
Guest - n/a Posts
#8: Re: question: multiple row insert vs. merge

Join Bytes! wrote:
Quote:
Originally Posted by
Quote:
Originally Posted by
>VALUES (?, ?, ?), (?, ?, ?), .... is the correct syntax

>
Imagine I have to insert n rows, and n is a different number every
time.
>
What is the recommended method from an embedded program/ODBC/JDBC (I
guess the same applies to all three):
>
- insert into (c1,c2) values (?, ?) (n statements with one row each)
- insert into (c1,c2) values (?, ?), (?, ?), (?, ?), ... (1 statement
with n rows)
- insert into (c1, c2) values (?, ?), (?, ?), ... (split the n rows
into n/100 inserts, and do the remainder in single row inserts)?
>
I think this answer would be an ideal pick for an FAQ as it seems to
be asked again and again... Is there such an FAQ for c.d.ibm-db2?

Google for "SQL on Fire". I describe the same technique there.
Now, some Db2 clients (CLI/ODBC for sure) JDBC I don't know do support
array or buffered insert.
In that case the client gathers up all the insert variables and sends
them to the server in one shot. The server then executes a single row
insert in a tight loop.

Cheers
Serge

--
Serge Rielau
DB2 Solutions Development
IBM Toronto Lab
w.l.fischer@googlemail.com's Avatar
w.l.fischer@googlemail.com September 4th, 2008 02:05 PM
Guest - n/a Posts
#9: Re: question: multiple row insert vs. merge

Google for "SQL on Fire". I describe the same technique there.

Found it :-)

Quote:
"Tip:

For mass inserts prepare INSERT statements with:
1, 10, 100, 1000, 10000 rows of parameter markers each.
Execute the biggest that fits the remaining load in a loop."

Thanks!
Henry J.'s Avatar
Guest - n/a Posts
#10: Re: question: multiple row insert vs. merge

On Sep 4, 7:15 am, Serge Rielau <srie...@ca.ibm.comwrote:
Quote:
Originally Posted by
w.l.fisc...@googlemail.com wrote:
Quote:
Originally Posted by
Quote:
Originally Posted by
VALUES (?, ?, ?), (?, ?, ?), .... is the correct syntax

>
Quote:
Originally Posted by
Imagine I have to insert n rows, and n is a different number every
time.

>
Quote:
Originally Posted by
What is the recommended method from an embedded program/ODBC/JDBC (I
guess the same applies to all three):

>
Quote:
Originally Posted by
- insert into (c1,c2) values (?, ?) (n statements with one row each)
- insert into (c1,c2) values (?, ?), (?, ?), (?, ?), ... (1 statement
with n rows)
- insert into (c1, c2) values (?, ?), (?, ?), ... (split the n rows
into n/100 inserts, and do the remainder in single row inserts)?

>
Quote:
Originally Posted by
I think this answer would be an ideal pick for an FAQ as it seems to
be asked again and again... Is there such an FAQ for c.d.ibm-db2?

>
Google for "SQL on Fire". I describe the same technique there.
Now, some Db2 clients (CLI/ODBC for sure) JDBC I don't know do support
array or buffered insert.
In that case the client gathers up all the insert variables and sends
them to the server in one shot. The server then executes a single row
insert in a tight loop.
>


If I send a bunch of single-row inserts like in "insert into t
values(?,?) insert into t values(?,?)
.....insert into t values(?,?) " via CLI/ODBC to DB2 in one *batch* ,
will CLI/ODBC know
to convert it to "insert into t values(?,?) (?,?) ....(?,?) " before
executing it?

Serge Rielau's Avatar
Guest - n/a Posts
#11: Re: question: multiple row insert vs. merge

Henry J. wrote:
Quote:
Originally Posted by
On Sep 4, 7:15 am, Serge Rielau <srie...@ca.ibm.comwrote:
Quote:
Originally Posted by
>w.l.fisc...@googlemail.com wrote:
Quote:
Originally Posted by
>>>VALUES (?, ?, ?), (?, ?, ?), .... is the correct syntax
>>Imagine I have to insert n rows, and n is a different number every
>>time.
>>What is the recommended method from an embedded program/ODBC/JDBC (I
>>guess the same applies to all three):
>>- insert into (c1,c2) values (?, ?) (n statements with one row each)
>>- insert into (c1,c2) values (?, ?), (?, ?), (?, ?), ... (1 statement
>>with n rows)
>>- insert into (c1, c2) values (?, ?), (?, ?), ... (split the n rows
>>into n/100 inserts, and do the remainder in single row inserts)?
>>I think this answer would be an ideal pick for an FAQ as it seems to
>>be asked again and again... Is there such an FAQ for c.d.ibm-db2?

>Google for "SQL on Fire". I describe the same technique there.
>Now, some Db2 clients (CLI/ODBC for sure) JDBC I don't know do support
>array or buffered insert.
>In that case the client gathers up all the insert variables and sends
>them to the server in one shot. The server then executes a single row
>insert in a tight loop.
>>

>
If I send a bunch of single-row inserts like in "insert into t
values(?,?) insert into t values(?,?)
....insert into t values(?,?) " via CLI/ODBC to DB2 in one *batch* ,
will CLI/ODBC know
to convert it to "insert into t values(?,?) (?,?) ....(?,?) " before
executing it?
>

No. CLI/ODBC will shred the statements and send them one by one.

Cheers
Serge


--
Serge Rielau
DB2 Solutions Development
IBM Toronto Lab
Dan van Ginhoven's Avatar
Dan van Ginhoven September 6th, 2008 08:25 AM
Guest - n/a Posts
#12: Re: question: multiple row insert vs. merge

Hi!

If you really want good performance, inserting rows into a table,
assuming your application can be designed this way:

1. Write the rows to be inserted or updated to a diskfile.
2. Use the load utility to load the file into a worktable.
3. Use the merge statement to update the target table from the worktable.

/dg


"Henry J." <tank209209@yahoo.comwrote in message news:a190116d-07d0-436f-aacb-996172dd5984@59g2000hsb.googlegroups.com...
Quote:
Originally Posted by
My app needs to insert thousand value rows into a mostly empty table
(data are read from a file). I can either use inserts, or use merge.
The advantage of using merge is that in the few cases where the table
is not empty, it can take care of the updating part, which makes the
app cleaner.
>
However, my concern is the merge state would slow dowm the insertion
of new data, since in most cases the table is empty.
>
So my questions (before I run any comprehensive tests) are:
>
1. would merge slower than insert if it is an empty table?
2. given that we can do multiple row value inserts, would insert be
even faster than merge with which there is no multipe row option?
>
Thanks a lot!



Henry J.'s Avatar
Guest - n/a Posts
#13: Re: question: multiple row insert vs. merge

On Sep 5, 7:23 am, Serge Rielau <srie...@ca.ibm.comwrote:
Quote:
Originally Posted by
Henry J. wrote:
Quote:
Originally Posted by
On Sep 4, 7:15 am, Serge Rielau <srie...@ca.ibm.comwrote:
Quote:
Originally Posted by
w.l.fisc...@googlemail.com wrote:
>>VALUES (?, ?, ?), (?, ?, ?), .... is the correct syntax
>Imagine I have to insert n rows, and n is a different number every
>time.
>What is the recommended method from an embedded program/ODBC/JDBC (I
>guess the same applies to all three):
>- insert into (c1,c2) values (?, ?) (n statements with one row each)
>- insert into (c1,c2) values (?, ?), (?, ?), (?, ?), ... (1 statement
>with n rows)
>- insert into (c1, c2) values (?, ?), (?, ?), ... (split the n rows
>into n/100 inserts, and do the remainder in single row inserts)?
>I think this answer would be an ideal pick for an FAQ as it seems to
>be asked again and again... Is there such an FAQ for c.d.ibm-db2?
Google for "SQL on Fire". I describe the same technique there.
Now, some Db2 clients (CLI/ODBC for sure) JDBC I don't know do support
array or buffered insert.
In that case the client gathers up all the insert variables and sends
them to the server in one shot. The server then executes a single row
insert in a tight loop.

>
Quote:
Originally Posted by
If I send a bunch of single-row inserts like in "insert into t
values(?,?) insert into t values(?,?)
....insert into t values(?,?) " via CLI/ODBC to DB2 in one *batch* ,
will CLI/ODBC know
to convert it to "insert into t values(?,?) (?,?) ....(?,?) " before
executing it?

>
No. CLI/ODBC will shred the statements and send them one by one.
>
Cheers
Serge
>
--
Serge Rielau
DB2 Solutions Development
IBM Toronto Lab


Good. Another question is, if I send a list of inserts each with two
value tuples, like this:
"insert into t values(?,?) (?,?)", instead of a list of "insert into t
values(?,?)",would it speed up the insertion? Thanks a lot! This is
really helpful, Serge.
Henry J.'s Avatar
Guest - n/a Posts
#14: Re: question: multiple row insert vs. merge

On Sep 6, 3:22 am, "Dan van Ginhoven" <danfa...@hotmail.comwrote:
Quote:
Originally Posted by
Hi!
>
If you really want good performance, inserting rows into a table,
assuming your application can be designed this way:
>
1. Write the rows to be inserted or updated to a diskfile.
2. Use the load utility to load the file into a worktable.
3. Use the merge statement to update the target table from the worktable.
>
/dg


Good tip. Questions:

1) what kind of load utility is best for the method you mentioned? In
Sybase, bcp is very good at this. I haven't heard of its counterpart
in DB2 yet.
2) Instead of writing to a file and then uploading it into a
worktable, how about inserting into a worktable directly? Will it be
slower than your method? or would it still be faster than inserting
into the target table directly? Note that it would be much simpler to
implement.

Thanks!

Serge Rielau's Avatar
Guest - n/a Posts
#15: Re: question: multiple row insert vs. merge

Henry J. wrote:
Quote:
Originally Posted by
On Sep 5, 7:23 am, Serge Rielau <srie...@ca.ibm.comwrote:
Quote:
Originally Posted by
>Henry J. wrote:
Quote:
Originally Posted by
>>On Sep 4, 7:15 am, Serge Rielau <srie...@ca.ibm.comwrote:
>>>w.l.fisc...@googlemail.com wrote:
>>>>>VALUES (?, ?, ?), (?, ?, ?), .... is the correct syntax
>>>>Imagine I have to insert n rows, and n is a different number every
>>>>time.
>>>>What is the recommended method from an embedded program/ODBC/JDBC (I
>>>>guess the same applies to all three):
>>>>- insert into (c1,c2) values (?, ?) (n statements with one row each)
>>>>- insert into (c1,c2) values (?, ?), (?, ?), (?, ?), ... (1 statement
>>>>with n rows)
>>>>- insert into (c1, c2) values (?, ?), (?, ?), ... (split the n rows
>>>>into n/100 inserts, and do the remainder in single row inserts)?
>>>>I think this answer would be an ideal pick for an FAQ as it seems to
>>>>be asked again and again... Is there such an FAQ for c.d.ibm-db2?
>>>Google for "SQL on Fire". I describe the same technique there.
>>>Now, some Db2 clients (CLI/ODBC for sure) JDBC I don't know do support
>>>array or buffered insert.
>>>In that case the client gathers up all the insert variables and sends
>>>them to the server in one shot. The server then executes a single row
>>>insert in a tight loop.
>>If I send a bunch of single-row inserts like in "insert into t
>>values(?,?) insert into t values(?,?)
>>....insert into t values(?,?) " via CLI/ODBC to DB2 in one *batch* ,
>>will CLI/ODBC know
>>to convert it to "insert into t values(?,?) (?,?) ....(?,?) " before
>>executing it?

>No. CLI/ODBC will shred the statements and send them one by one.
>>
>Cheers
>Serge
>>
>--
>Serge Rielau
>DB2 Solutions Development
>IBM Toronto Lab

>
Good. Another question is, if I send a list of inserts each with two
value tuples, like this:
"insert into t values(?,?) (?,?)", instead of a list of "insert into t
values(?,?)",would it speed up the insertion? Thanks a lot! This is
really helpful, Serge.

Yes. I did some measurements a while ago (very unscientific) and found
that it makes sense to bundle up to 100 rows. After that it doesn't make
a difference. Of course network latency plays a role (my client was local).

Cheers
Serge
--
Serge Rielau
DB2 Solutions Development
IBM Toronto Lab
w.l.fischer@googlemail.com's Avatar
w.l.fischer@googlemail.com September 9th, 2008 07:35 PM
Guest - n/a Posts
#16: Re: question: multiple row insert vs. merge

1) what kind of load utility is best for the method you mentioned? *In
Quote:
Originally Posted by
Sybase, bcp is very good at this. *I haven't heard of its counterpart
in DB2 yet.


Search the documentation for the "load" command. This example loads
from a file in csv format:
load from file.csv of del replace into <tablenamenonrecoverable;
Quote:
Originally Posted by
2) Instead of writing to a file and then uploading it into a
worktable, how about inserting into a worktable directly? *Will it be
slower than your method? *or would it still be faster than inserting
into the target table directly? *Note that it would be much simpler to
implement.


I don't know but I I guess it will be slower instead of faster.
Dan van Ginhoven's Avatar
Dan van Ginhoven September 9th, 2008 08:45 PM
Guest - n/a Posts
#17: Re: question: multiple row insert vs. merge


"Henry J." <tank209209@yahoo.comwrote in message news:cb85c3b0-479a-48eb-ba18-4a4410c5e174@s50g2000hsb.googlegroups.com...
Quote:
Originally Posted by
On Sep 6, 3:22 am, "Dan van Ginhoven" <danfa...@hotmail.comwrote:
Quote:
Originally Posted by
Hi!

If you really want good performance, inserting rows into a table,
assuming your application can be designed this way:

1. Write the rows to be inserted or updated to a diskfile.
2. Use the load utility to load the file into a worktable.
3. Use the merge statement to update the target table from the worktable.

/dg

>
Good tip. Questions:
>
1) what kind of load utility is best for the method you mentioned? In
Sybase, bcp is very good at this. I haven't heard of its counterpart
in DB2 yet.

db2 load is as fast as bcp but the syntax is more complex.
Quote:
Originally Posted by
2) Instead of writing to a file and then uploading it into a
worktable, how about inserting into a worktable directly? Will it be
slower than your method? or would it still be faster than inserting
into the target table directly? Note that it would be much simpler to
implement.

It depends on the number of rows.
Using db2load means going out of db2, and access it from the command line.
db2load take some time to initialize. There is a break even somewhere
and I guess that it is above some thousand rows. So unless you are in a batch situation as opposed to online
I wouldn't try load.
/dg
Quote:
Originally Posted by
Thanks!
>



Henry J.'s Avatar
Guest - n/a Posts
#18: Re: question: multiple row insert vs. merge

On Sep 7, 2:48*pm, Serge Rielau <srie...@ca.ibm.comwrote:
Quote:
Originally Posted by
Quote:
Originally Posted by
Good. *Another question is, if I send a list of inserts each with two
value tuples, like this:
"insert into t values(?,?) (?,?)", instead of a list of "insert into t
values(?,?)",would it speed up the insertion? *Thanks a lot! *This is
really helpful, Serge.

>
Yes. I did some measurements a while ago (very unscientific) and found
that it makes sense to bundle up to 100 rows. After that it doesn't make
a difference. Of course network latency plays a role (my client was local).
>


Hope this is my last question in this thread. Now I know it is faster
to send a batch of 1000
"insert into t values(?,?) (?,?)" than sending 2000 "insert into t
values(?,?)". Is it equally fast if i send 1000 merge statements like
"merge into t using table (values(?,?) (?,?) ) ..."? Faster than 2000
"merge into t using table (values(?,?) ) ...", right?

I know I can find out by testing but tips from experts like you guys
save me tons of time ... Really appreciate it.

Thanks!
Serge Rielau's Avatar
Guest - n/a Posts
#19: Re: question: multiple row insert vs. merge

Henry J. wrote:
Quote:
Originally Posted by
On Sep 7, 2:48 pm, Serge Rielau <srie...@ca.ibm.comwrote:
Quote:
Originally Posted by
Quote:
Originally Posted by
>>Good. Another question is, if I send a list of inserts each with two
>>value tuples, like this:
>>"insert into t values(?,?) (?,?)", instead of a list of "insert into t
>>values(?,?)",would it speed up the insertion? Thanks a lot! This is
>>really helpful, Serge.

>Yes. I did some measurements a while ago (very unscientific) and found
>that it makes sense to bundle up to 100 rows. After that it doesn't make
>a difference. Of course network latency plays a role (my client was local).
>>

>
Hope this is my last question in this thread. Now I know it is faster
to send a batch of 1000
"insert into t values(?,?) (?,?)" than sending 2000 "insert into t
values(?,?)". Is it equally fast if i send 1000 merge statements like
"merge into t using table (values(?,?) (?,?) ) ..."? Faster than 2000
"merge into t using table (values(?,?) ) ...", right?
>
I know I can find out by testing but tips from experts like you guys
save me tons of time ... Really appreciate it.

MERGE is a special child....
If you use a MERGE straight with a multi-row VALUES DB2 does not know
whether the new rows match to unique rows in the MERGE target.
If DB2 can't rely on that the MERGE statement gets a lot more complex
(and slower). Here is how to do it right:

MERGE INTO T USING (SELECT pk, MAX(c1), MAX(c2) FROM (VALUES (?, ?, ?),
....) AS V(pk, c1, c2)) AS S(pk, c1, c2)
ON T.pk = S.pk
WHEN MATCHED THEN UPDATE SET c1 = S.c1, c2 = S.c2
WHEN NOT MATCHED THEN INSERT VALUES (pk, c1, c2)

Not that in a case of 200 single-row merge teh semantics are different.
You can absolutely have dups since teh MERGEs are commulatative (sp??).

Cheers
Serge

--
Serge Rielau
DB2 Solutions Development
IBM Toronto Lab
Henry J.'s Avatar
Guest - n/a Posts
#20: Re: question: multiple row insert vs. merge

On Sep 10, 10:27*am, Serge Rielau <srie...@ca.ibm.comwrote:
Quote:
Originally Posted by
Henry J. wrote:
Quote:
Originally Posted by
On Sep 7, 2:48 pm, Serge Rielau <srie...@ca.ibm.comwrote:
Quote:
Originally Posted by
>Good. *Another question is, if I send a list of inserts each with two
>value tuples, like this:
>"insert into t values(?,?) (?,?)", instead of a list of "insert into t
>values(?,?)",would it speed up the insertion? *Thanks a lot! *This is
>really helpful, Serge.
Yes. I did some measurements a while ago (very unscientific) and found
that it makes sense to bundle up to 100 rows. After that it doesn't make
a difference. Of course network latency plays a role (my client was local).

>
Quote:
Originally Posted by
Hope this is my last question in this thread. *Now I know it is faster
to send a batch of 1000
"insert into t values(?,?) (?,?)" than sending 2000 "insert into t
values(?,?)". *Is it equally fast if i send 1000 merge statements like
"merge into t using table (values(?,?) (?,?) ) ..."? *Faster than 2000
"merge into t using table (values(?,?) ) ...", right?

>
Quote:
Originally Posted by
I know I can find out by testing but tips from experts like you guys
save me tons of time ... *Really appreciate it.

>
MERGE is a special child....
If you use a MERGE straight with a multi-row VALUES DB2 does not know
whether the new rows match to unique rows in the MERGE target.
If DB2 can't rely on that the MERGE statement gets a lot more complex
(and slower). Here is how to do it right:
>
MERGE INTO T USING (SELECT pk, MAX(c1), MAX(c2) FROM (VALUES (?, ?, ?),
...) AS V(pk, c1, c2)) AS S(pk, c1, c2)
ON T.pk = S.pk
WHEN MATCHED THEN UPDATE SET c1 = S.c1, c2 = S.c2
WHEN NOT MATCHED THEN INSERT VALUES (pk, c1, c2)
>
Not that in a case of 200 single-row merge teh semantics are different.
You can absolutely have dups since teh MERGEs are commulatative (sp??).
>
Cheers
Serge
>
--
Serge Rielau
DB2 Solutions Development
IBM Toronto Lab- Hide quoted text -
>
- Show quoted text -


Do you mean that we indeed can achieve similar performance boost using
merge? I'm not sure if I understand your trick with "select
max ...". Are you intended to do a "group by"? Can we use union to
remove dupe as in "values(?,?,?) union values(?,?,?) union .... "?

Thanks!
Serge Rielau's Avatar
Guest - n/a Posts
#21: Re: question: multiple row insert vs. merge

Henry J. wrote:
Quote:
Originally Posted by
On Sep 10, 10:27 am, Serge Rielau <srie...@ca.ibm.comwrote:
Quote:
Originally Posted by
>Henry J. wrote:
Quote:
Originally Posted by
>>On Sep 7, 2:48 pm, Serge Rielau <srie...@ca.ibm.comwrote:
>>>>Good. Another question is, if I send a list of inserts each with two
>>>>value tuples, like this:
>>>>"insert into t values(?,?) (?,?)", instead of a list of "insert into t
>>>>values(?,?)",would it speed up the insertion? Thanks a lot! This is
>>>>really helpful, Serge.
>>>Yes. I did some measurements a while ago (very unscientific) and found
>>>that it makes sense to bundle up to 100 rows. After that it doesn't make
>>>a difference. Of course network latency plays a role (my client was local).
>>Hope this is my last question in this thread. Now I know it is faster
>>to send a batch of 1000
>>"insert into t values(?,?) (?,?)" than sending 2000 "insert into t
>>values(?,?)". Is it equally fast if i send 1000 merge statements like
>>"merge into t using table (values(?,?) (?,?) ) ..."? Faster than 2000
>>"merge into t using table (values(?,?) ) ...", right?
>>I know I can find out by testing but tips from experts like you guys
>>save me tons of time ... Really appreciate it.

>MERGE is a special child....
>If you use a MERGE straight with a multi-row VALUES DB2 does not know
>whether the new rows match to unique rows in the MERGE target.
>If DB2 can't rely on that the MERGE statement gets a lot more complex
>(and slower). Here is how to do it right:
>>
>MERGE INTO T USING (SELECT pk, MAX(c1), MAX(c2) FROM (VALUES (?, ?, ?),
>...) AS V(pk, c1, c2)) AS S(pk, c1, c2)
>ON T.pk = S.pk
>WHEN MATCHED THEN UPDATE SET c1 = S.c1, c2 = S.c2
>WHEN NOT MATCHED THEN INSERT VALUES (pk, c1, c2)
>>
>Not that in a case of 200 single-row merge teh semantics are different.
>You can absolutely have dups since teh MERGEs are commulatative (sp??).
>>
>Cheers
>Serge
>>
>--
>Serge Rielau
>DB2 Solutions Development
>IBM Toronto Lab- Hide quoted text -
>>
>- Show quoted text -

>
Do you mean that we indeed can achieve similar performance boost using
merge? I'm not sure if I understand your trick with "select
max ...". Are you intended to do a "group by"? Can we use union to
remove dupe as in "values(?,?,?) union values(?,?,?) union .... "?

Yes, you need a GROUP BY pk. If you want to make it messier use UNION.
But the result is the same.

Cheers
Serge
--
Serge Rielau
DB2 Solutions Development
IBM Toronto Lab
Henry J.'s Avatar
Guest - n/a Posts
#22: Re: question: multiple row insert vs. merge

On Sep 17, 8:04*pm, Serge Rielau <srie...@ca.ibm.comwrote:
Quote:
Originally Posted by
HenryJ. wrote:
Quote:
Originally Posted by
On Sep 10, 10:27 am, Serge Rielau <srie...@ca.ibm.comwrote:
Quote:
Originally Posted by
>HenryJ. wrote:
>On Sep 7, 2:48 pm, Serge Rielau <srie...@ca.ibm.comwrote:
>>>Good. *Another question is, if I send a list of inserts each withtwo
>>>value tuples, like this:
>>>"insert into t values(?,?) (?,?)", instead of a list of "insert into t
>>>values(?,?)",would it speed up the insertion? *Thanks a lot! *This is
>>>really helpful, Serge.
>>Yes. I did some measurements a while ago (very unscientific) and found
>>that it makes sense to bundle up to 100 rows. After that it doesn't make
>>a difference. Of course network latency plays a role (my client was local).
>Hope this is my last question in this thread. *Now I know it is faster
>to send a batch of 1000
>"insert into t values(?,?) (?,?)" than sending 2000 "insert into t
>values(?,?)". *Is it equally fast if i send 1000 merge statements like
>"merge into t using table (values(?,?) (?,?) ) ..."? *Faster than 2000
>"merge into t using table (values(?,?) ) ...", right?
>I know I can find out by testing but tips from experts like you guys
>save me tons of time ... *Really appreciate it.
MERGE is a special child....
If you use a MERGE straight with a multi-row VALUES DB2 does not know
whether the new rows match to unique rows in the MERGE target.
If DB2 can't rely on that the MERGE statement gets a lot more complex
(and slower). Here is how to do it right:

>
Quote:
Originally Posted by
Quote:
Originally Posted by
MERGE INTO T USING (SELECT pk, MAX(c1), MAX(c2) FROM (VALUES (?, ?, ?),
...) AS V(pk, c1, c2)) AS S(pk, c1, c2)
ON T.pk = S.pk
WHEN MATCHED THEN UPDATE SET c1 = S.c1, c2 = S.c2
WHEN NOT MATCHED THEN INSERT VALUES (pk, c1, c2)

>
Quote:
Originally Posted by
Quote:
Originally Posted by
Not that in a case of 200 single-row merge teh semantics are different..
You can absolutely have dups since teh MERGEs are commulatative (sp??)..

>
Quote:
Originally Posted by
Quote:
Originally Posted by
Cheers
Serge

>
Quote:
Originally Posted by
Quote:
Originally Posted by
--
Serge Rielau
DB2 Solutions Development
IBM Toronto Lab- Hide quoted text -

>
Quote:
Originally Posted by
Quote:
Originally Posted by
- Show quoted text -

>
Quote:
Originally Posted by
Do you mean that we indeed can achieve similar performance boost using
merge? * I'm not sure if I understand your trick with "select
max ...". *Are you intended to do a "group by"? * Can we use union to
remove dupe as in *"values(?,?,?) union values(?,?,?) union .... "?

>
Yes, you need a GROUP BY pk. If you want to make it messier use UNION.
But the result is the same.
>
Cheers
Serge
--
Serge Rielau
DB2 Solutions Development
IBM Toronto Lab


So we need to use "group by" to let the optimizer know that there is
no dupe value?

How about "insert into t values (?,?)(?,?)"? It can also contain dupe
values?

Note that our app will never send dupe values to be inserted or
merged.

Thanks.
Serge Rielau's Avatar
Guest - n/a Posts
#23: Re: question: multiple row insert vs. merge

Henry J. wrote:
Quote:
Originally Posted by
Note that our app will never send dupe values to be inserted or
merged.

DB2 cannot know that. The only way for DB2 to know is GROUP BY the PK.
Note that UNION actually (I thought more about it) does NOT work.
UNION assures uniqueness of the row. That is not enough. You need
uniqueness of the key used in teh MERGe ON clause.

Cheers
Serge

PS: I think we have given you all the bread crumbs you need and I am on
my way to a beach. So run experimenets. Look at the optimizer plan, ...

--
Serge Rielau
DB2 Solutions Development
IBM Toronto Lab
Henry J.'s Avatar
Guest - n/a Posts
#24: Re: question: multiple row insert vs. merge

On Sep 18, 9:02*am, Serge Rielau <srie...@ca.ibm.comwrote:
Quote:
Originally Posted by
HenryJ. wrote:
Quote:
Originally Posted by
Note that our app will never send dupe values to be inserted or
merged.

>
DB2 cannot know that. The only way for DB2 to know is GROUP BY the PK.
Note that UNION actually (I thought more about it) does NOT work.
UNION assures uniqueness of the row. That is not enough. You need
uniqueness of the key used in teh MERGe ON clause.
>
Cheers
Serge
>
PS: I think we have given you all the bread crumbs you need and I am on
my way to a beach. So run experimenets. Look at the optimizer plan, ...
>
--
Serge Rielau
DB2 Solutions Development
IBM Toronto Lab


Thanks Serge. Really appreciated. I got all I need to know. Have
you written any books? A true expert and kind soul. Wish you great
time on beach.
Serge Rielau's Avatar
Guest - n/a Posts
#25: Re: question: multiple row insert vs. merge

Henry J. wrote:
Quote:
Originally Posted by
Thanks Serge. Really appreciated. I got all I need to know. Have
you written any books? A true expert and kind soul. Wish you great
time on beach.

Hah! It takes a patient soul to write a book. I am many things, but not
that.

Cheers
Serge

--
Serge Rielau
DB2 Solutions Development
IBM Toronto Lab
 
Not the answer you were looking for? Post your question . . .
197,042 members ready to help you find a solution.
Join Bytes.com

What is Bytes?

We are a network of experts and professionals in IT and software development that help one another with answers to tough questions and share insights. Get the best answers to your questions from over 197,042 network members.
Post your question now . . .
It's fast and it's free

Popular Articles

Top Community Contributors