473,847 Members | 1,460 Online
Bytes | Software Development & Data Engineering Community
+ Post

Home Posts Topics Members FAQ

Delete and concurrency problems

Hello,

Env: DB2 V8 LUW FP16 running Linux

create Table X (machine_id varchar(24) not null, ctime timestamp not
null);
create index iFoo on X (MACHINE_ID, CTIME) allow reverse scans;
alter table X add primary key (MACHINE_ID, CTIME);

Our C++ application inserts data into a table X using CLI array insert
with very good throughput.

On an hourly basis, another application (running on a different
server) reads the last 'hour':

select * from T where MACHINE_ID = ? and ctime between current
timestamp - 1 hour and current timestamp for read only;

After that, it starts a delete loop:

while (SQLCODE <100 )
do
delete from (select 1 from T where MACHINE_ID = ? and CTIME between
current timestamp - 1 hour and current timestamp fetch first 2000 rows
only) as B
done
The deletes have severe impact on my insert rate/throughput. I can see
number 3 times slower, or even worse. There are absolutely *NO* lock-
waits.

Question: Should I expect this performance degradation or am I missing
something ? Maybe there is a better technique to delete data.

Any thoughts ?

Thanks in advance,
Sep 18 '08 #1
8 2987
"Michel Esber" <mi****@us.auto matos.comwrote in message
news:d3******** *************** ***********@v39 g2000pro.google groups.com...
Hello,

Env: DB2 V8 LUW FP16 running Linux

create Table X (machine_id varchar(24) not null, ctime timestamp not
null);
create index iFoo on X (MACHINE_ID, CTIME) allow reverse scans;
alter table X add primary key (MACHINE_ID, CTIME);

Our C++ application inserts data into a table X using CLI array insert
with very good throughput.

On an hourly basis, another application (running on a different
server) reads the last 'hour':

select * from T where MACHINE_ID = ? and ctime between current
timestamp - 1 hour and current timestamp for read only;

After that, it starts a delete loop:

while (SQLCODE <100 )
do
delete from (select 1 from T where MACHINE_ID = ? and CTIME between
current timestamp - 1 hour and current timestamp fetch first 2000 rows
only) as B
done
The deletes have severe impact on my insert rate/throughput. I can see
number 3 times slower, or even worse. There are absolutely *NO* lock-
waits.

Question: Should I expect this performance degradation or am I missing
something ? Maybe there is a better technique to delete data.

Any thoughts ?

Thanks in advance,
db2set DB2_SKIPDELETED =ON
db2set DB2_SKIPINSERTE D=ON

then restart the instance (db2stop, db2start)
Sep 18 '08 #2
db2set DB2_SKIPDELETED =ON
db2set DB2_SKIPINSERTE D=ON

then restart the instance (db2stop, db2start)- Ocultar texto entre aspas -
Hi Mark,

Thanks for your feedback. However, this variable was already set.
Here is my config:

DB2LINUXAIO=TRU E
DB2_SKIPINSERTE D=ON
DB2_SCATTERED_I O=ON
DB2_USE_ALTERNA TE_PAGE_CLEANIN G=ON
DB2_EVALUNCOMMI TTED=ON
DB2_SKIPDELETED =ON
DB2COMM=tcpip
DB2_PARALLEL_IO =*
DB2AUTOSTART=NO

Maybe v9 has a better locking/concurrency mechanism ?

-M
Sep 19 '08 #3
I was reading Chris Eaton´s article:

http://it.toolbox.com/blogs/db2luw/s...statement-7239

And it seems like the use of select from delete may be useful here.

However, here is the problem:

My application has to read up to 500K rows, summarizes data, and then
it loops deleting data in chunks of 2k rows.
I do need to read ALL the rows within a range and only after that
start the deletion.

select * from OLD TABLE (delete from X where MACHINE_ID = ? and ctime
between ? and ? fetch first 2000 rows only) doesn´t seem to suit my
needs.

Any thoughts?

Thanks,
Sep 19 '08 #4
Michel Esber wrote:
I was reading Chris Eaton´s article:

http://it.toolbox.com/blogs/db2luw/s...statement-7239

And it seems like the use of select from delete may be useful here.

However, here is the problem:

My application has to read up to 500K rows, summarizes data, and then
it loops deleting data in chunks of 2k rows.
I do need to read ALL the rows within a range and only after that
start the deletion.

select * from OLD TABLE (delete from X where MACHINE_ID = ? and ctime
between ? and ? fetch first 2000 rows only) doesn´t seem to suit my
needs.
General question: How does teh plan look like? Are you deleting by index
or does DB2 scan?

Cheers
Serge

--
Serge Rielau
DB2 Solutions Development
IBM Toronto Lab
Sep 19 '08 #5
General question: How does teh plan look like? Are you deleting by index
or does DB2 scan?
Hello Serge,

It is deleting by index. Here is the full plan:

*************** ***** EXPLAIN INSTANCE *************** *****

DB2_VERSION: 08.02.8
SOURCE_NAME: SQLC2E07
SOURCE_SCHEMA: NULLID
SOURCE_VERSION:
EXPLAIN_TIME: 2008-09-19-11.05.35.744822
EXPLAIN_REQUEST ER: DB2INST1

Database Context:
----------------
Parallelism: None
CPU Speed: 4.802167e-07
Comm Speed: 0
Buffer Pool size: 189000
Sort Heap size: 2048
Database Heap size: 16000
Lock List size: 8192
Maximum Lock List: 20
Average Applications: 50
Locks Available: 167116

Package Context:
---------------
SQL Type: Dynamic
Optimization Level: 5
Blocking: Block All Cursors
Isolation Level: Cursor Stability

---------------- STATEMENT 1 SECTION 203 ----------------
QUERYNO: 1
QUERYTAG:
Statement Type: Searched Delete
Updatable: Not Applicable
Deletable: Not Applicable
Query Degree: 1

Original Statement:
------------------
DELETE
FROM RTM.TBL_COLLECT _PSSTAT_WIN_RTM
where MACHINE_ID = ? and collect_time between ? and ?
Optimized Statement:
-------------------
DELETE
FROM RTM.TBL_COLLECT _PSSTAT_WIN_RTM AS Q1
WHERE $RID$ IN
(SELECT $RID$
FROM RTM.TBL_COLLECT _PSSTAT_WIN_RTM AS Q2
WHERE (:? <= :?) AND (Q2.COLLECT_TIM E <= :?) AND (:? <=
Q2.COLLECT_TIME )
AND (Q2.MACHINE_ID = :?))

Access Plan:
-----------
Total Cost: 73.0298
Query Degree: 1

Rows
RETURN
( 1)
Cost
I/O
|
8.39196
DELETE
( 2)
73.0298
11.392
/------+------\
8.39196 2.918e+06
IXSCAN TABLE: RTM
( 3) TBL_COLLECT_PSS TAT_WIN_RTM
19.2689
3
|
2.918e+06
INDEX: DB2INST1
IX_PS_WIN


Extended Diagnostic Information:
--------------------------------

No extended Diagnostic Information for this statment.
Plan Details:
-------------
1) RETURN: (Return Result)
Cumulative Total Cost: 73.0298
Cumulative CPU Cost: 252522
Cumulative I/O Cost: 11.392
Cumulative Re-Total Cost: 53.794
Cumulative Re-CPU Cost: 178061
Cumulative Re-I/O Cost: 8.39196
Cumulative First Row Cost: 73.0298
Estimated Bufferpool Buffers: 12.392

Arguments:
---------
BLDLEVEL: (Build level)
DB2 v8.1.2.136 : special_19546
HEAPUSE : (Maximum Statement Heap Usage)
56 Pages
STMTHEAP: (Statement heap size)
4096

Input Streams:
-------------
4) From Operator #2

Estimated number of rows:
8.39196
Number of columns: 0
Subquery predicate ID: Not
Applicable
2) DELETE: (Delete)
Cumulative Total Cost: 73.0298
Cumulative CPU Cost: 252522
Cumulative I/O Cost: 11.392
Cumulative Re-Total Cost: 53.794
Cumulative Re-CPU Cost: 178061
Cumulative Re-I/O Cost: 8.39196
Cumulative First Row Cost: 73.0298
Estimated Bufferpool Buffers: 12.392

Input Streams:
-------------
2) From Operator #3

Estimated number of rows:
8.39196
Number of columns: 2
Subquery predicate ID: Not
Applicable

Column Names:
------------
+Q3.$C0+Q3.$C1
Output Streams:
--------------
3) To Object RTM.TBL_COLLECT _PSSTAT_WIN_RTM

Estimated number of rows: 2.918e
+06
Number of columns: 1
Subquery predicate ID: Not
Applicable

Column Names:
------------
+Q1.$RID$

4) To Operator #1

Estimated number of rows:
8.39196
Number of columns: 0
Subquery predicate ID: Not
Applicable
3) IXSCAN: (Index Scan)
Cumulative Total Cost: 19.2689
Cumulative CPU Cost: 143426
Cumulative I/O Cost: 3
Cumulative Re-Total Cost: 0.0331183
Cumulative Re-CPU Cost: 68965.2
Cumulative Re-I/O Cost: 0
Cumulative First Row Cost: 19.245
Estimated Bufferpool Buffers: 4

Arguments:
---------
MAXPAGES: (Maximum pages for prefetch)
1
PREFETCH: (Type of Prefetch)
NONE
ROWLOCK : (Row Lock intent)
UPDATE
SCANDIR : (Scan Direction)
FORWARD
TABLOCK : (Table Lock intent)
INTENT EXCLUSIVE

Predicates:
----------
2) Sargable Predicate
Relational Operator: Less Than or
Equal (<=)
Subquery Input Required: No
Filter Factor: 0.333333

Predicate Text:
--------------
(:? <= :?)

3) Stop Key Predicate
Relational Operator: Less Than or
Equal (<=)
Subquery Input Required: No
Filter Factor: 0.0200165

Predicate Text:
--------------
(Q2.COLLECT_TIM E <= :?)

4) Start Key Predicate
Relational Operator: Less Than or
Equal (<=)
Subquery Input Required: No
Filter Factor: 0.0200165

Predicate Text:
--------------
(:? <= Q2.COLLECT_TIME )

5) Start Key Predicate
Relational Operator: Equal (=)
Subquery Input Required: No
Filter Factor: 0.00143678

Predicate Text:
--------------
(Q2.MACHINE_ID = :?)

5) Stop Key Predicate
Relational Operator: Equal (=)
Subquery Input Required: No
Filter Factor: 0.00143678

Predicate Text:
--------------
(Q2.MACHINE_ID = :?)
Input Streams:
-------------
1) From Object DB2INST1.IX_PS_ WIN

Estimated number of rows: 2.918e
+06
Number of columns: 3
Subquery predicate ID: Not
Applicable

Column Names:
------------
+Q2.$RID$+Q2.MA CHINE_ID
+Q2.COLLECT_TIM E
Output Streams:
--------------
2) To Operator #2

Estimated number of rows:
8.39196
Number of columns: 2
Subquery predicate ID: Not
Applicable

Column Names:
------------
+Q3.$C0+Q3.$C1
Objects Used in Access Plan:
---------------------------

Schema: DB2INST1
Name: IX_PS_WIN
Type: Index
Time of creation:
2008-07-11-18.07.56.326366
Last statistics update:
2008-09-06-10.49.52.692970
Number of columns: 4
Number of rows: 2917997
Width of rows: -1
Number of buffer pool pages: 27158
Distinct row values: Yes
Tablespace name: PSIDXWIN
Tablespace overhead: 6.000000
Tablespace transfer rate: 0.400000
Source for statistics: Single Node
Prefetch page count: 128
Container extent page count: 32
Index clustering statistic: 0.941143
Index leaf pages: 51075
Index tree levels: 4
Index full key cardinality: 2917997
Index first key cardinality: 696
Index first 2 keys cardinality: 28598
Index first 3 keys cardinality: 2917997
Index first 4 keys cardinality: 2917997
Index sequential pages: 51074
Index page density: 98
Index avg sequential pages: 51074
Index avg gap between sequences:0
Index avg random pages: 0
Fetch avg sequential pages: -1
Fetch avg gap between sequences:-1
Fetch avg random pages: -1
Index RID count: 2917997
Index deleted RID count: 0
Index empty leaf pages: 0
Base Table Schema: RTM
Base Table Name:
TBL_COLLECT_PSS TAT_WIN_RTM
Columns in index:
MACHINE_ID
COLLECT_TIME
PROCESS_PID
PROCESS_NAME

Schema: RTM
Name: TBL_COLLECT_PSS TAT_WIN_RTM
Type: Table
Time of creation:
2008-01-17-11.51.05.399878
Last statistics update:
2008-09-06-10.49.52.692970
Number of columns: 12
Number of rows: 2917997
Width of rows: 50
Number of buffer pool pages: 27158
Distinct row values: No
Tablespace name: PSDATWIN
Tablespace overhead: 6.000000
Tablespace transfer rate: 0.400000
Source for statistics: Single Node
Prefetch page count: 128
Container extent page count: 32
Table overflow record count: 0
Table Active Blocks: -1
Sep 19 '08 #6
Just ideas... You might as well already have ruled them out...

One way to delete rows fast ist having them clustered by an MDC-index
(can be one-dimensional) and the registry variable DB2_MDC_ROLLOUT
being set (since DB2 V8.2 I believe). Just heard about it and I would
be interested in how much deletes are sped up by this in reality...

Another way of course is DB2 V9 partitioning and dropping partitions.
Sep 19 '08 #7
There are always contention on bufferpool and index root and leaf
pages between delete and insert whenever delete starts because delete
is always to delete the latest rows, those are rows that the insert
process has just inserted. So the performance degradation is expected.

RCT table maybe is worth to try, at least less contention. But there
are many restrictions on RCT table.

Sep 21 '08 #8
I don't really understand what you want to do with these steps...:
1) permanent (and concurrent) inserts with actual timestamp
2) every hour read / summarize - until here I understand - then delete
all until "now" - this I don't understand... because you will delete
un"processed/summarized" rows.

And also: You create an index and then a primary key with the same
columns... this throws an error, because the same columns can't be
used twice for an index.

Just my few cents...

On Sep 18, 10:21*pm, Michel Esber <mic...@us.auto matos.comwrote:
Hello,

Env: DB2 V8 LUW FP16 running Linux

create Table X (machine_id varchar(24) not null, ctime timestamp not
null);
create index iFoo on X (MACHINE_ID, CTIME) allow reverse scans;
alter table X add primary key (MACHINE_ID, CTIME);

Our C++ application inserts data into a table X using CLI array insert
with very good throughput.

On an hourly basis, another application (running on a different
server) reads the last 'hour':

select * from T where MACHINE_ID = ? and ctime between current
timestamp - 1 hour and current timestamp for read only;

After that, it starts a delete loop:

while (SQLCODE <100 )
do
delete from (select 1 from T where MACHINE_ID = ? and CTIME between
current timestamp - 1 hour and current timestamp fetch first 2000 rows
only) as B
done

The deletes have severe impact on my insert rate/throughput. I can see
number 3 times slower, or even worse. There are absolutely *NO* lock-
waits.

Question: Should I expect this performance degradation or am I missing
something ? Maybe there is a better technique to delete data.

Any thoughts ?

Thanks in advance,
Sep 24 '08 #9

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

16
2907
by: aurora | last post by:
Hello! Just gone though an article via Slashdot titled "The Free Lunch Is Over: A Fundamental Turn Toward Concurrency in Software" http://www.gotw.ca/publications/concurrency-ddj.htm]. It argues that the continous CPU performance gain we've seen is finally over. And that future gain would primary be in the area of software concurrency taking advantage hyperthreading and multicore architectures. Perhaps something the Python interpreter...
3
2451
by: Suzanne | last post by:
Hi All I'm having problems getting my data adapter to throw a concurrency exception with an INSERT command. I want to throw a concurrency exception if an attempt is made to enter a row into tb_table when a row with the same int_UID already exists in there. Here is my stored procedure: if not exists (select int_UID from tb_table where int_UID = @aint_UID)
32
3220
by: Christopher Benson-Manica | last post by:
Is the following code legal, moral, and advisable? #include <iostream> class A { private: int a; public: A() : a(42) {}
4
3832
by: Steven Nagy | last post by:
Hi Have a problem that consistantly occurs in my applications where using a DataAdapter (OLEDB) and a dataset. Using a simple process of adding a row to the dataset table and then calling the dataAdapter.Update method. The row adds. However if I delete that new row immediately via the dataset, I get a concurrency violation. By deleting via dataset, I mean, dataset.table(rownum).Delete() then once again calling the
2
3491
by: BobAchgill | last post by:
Do you know why this error might be happening? Maybe it is because I have two data adapters open on the same MDB file?? If this is the problem... how can I keep them from disturbing each other? An unhandled exception of type 'System.Data.DBConcurrencyException' occurred in system.data.dll Additional information: Concurrency violation: the
4
1550
by: papaja | last post by:
Hi, I posted this on asp.net forums already, but nobody answered it. Here is direct link: http://forums.asp.net/1124640/ShowPost.aspx. Here is the question: I'm using this code in delete command of datagrid: **************** Dataset1.DSataTable1.Rows(e.Item.ItemIndex).Delete() adp.Update(DataSet1)
9
2039
by: corey.coughlin | last post by:
Alright, so I've been following some of the arguments about enhancing parallelism in python, and I've kind of been struck by how hard things still are. It seems like what we really need is a more pythonic approach. One thing I've been seeing suggested a lot lately is that running jobs in separate processes, to make it easy to use the latest multiprocessor machines. Makes a lot of sense to me, those processors are going to be more and...
2
1403
by: John | last post by:
Hi During db development one of the most frustrating thing is the Concurrency Violation error which seems to be a general error message for any error during saving of record(s). How can one debug such an error and get more specific information on the underlying reason for error and the fields/records responsible for it? Thanks
6
1694
by: Jordan S. | last post by:
Using .NET 3.5... 1. Will ASP.NET guarantee that no more than ONE instance of a custom HTTP handler factory (class implementing IHttpHandlerFactory) is ever created in an ASP.NET application? 2. Assuming the answer to #1 is "yes - never more than one instance" (I hope that's true), then does ASP.NET automatically take care of concurrency with respect to accessing that one instance?
0
9892
marktang
by: marktang | last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However, people are often confused as to whether an ONU can Work As a Router. In this blog post, we’ll explore What is ONU, What Is Router, ONU & Router’s main usage, and What is the difference between ONU and Router. Let’s take a closer look ! Part I. Meaning of...
0
9734
by: Hystou | last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can effortlessly switch the default language on Windows 10 without reinstalling. I'll walk you through it. First, let's disable language synchronization. With a Microsoft account, language settings sync across devices. To prevent any complications,...
0
10654
jinu1996
by: jinu1996 | last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven tapestry of website design and digital marketing. It's not merely about having a website; it's about crafting an immersive digital experience that captivates audiences and drives business growth. The Art of Business Website Design Your website is...
0
10347
tracyyun
by: tracyyun | last post by:
Dear forum friends, With the development of smart home technology, a variety of wireless communication protocols have appeared on the market, such as Zigbee, Z-Wave, Wi-Fi, Bluetooth, etc. Each protocol has its own unique characteristics and advantages, but as a user who is planning to build a smart home system, I am a bit confused by the choice of these technologies. I'm particularly interested in Zigbee because I've heard it does some...
0
9493
agi2029
by: agi2029 | last post by:
Let's talk about the concept of autonomous AI software engineers and no-code agents. These AIs are designed to manage the entire lifecycle of a software development project—planning, coding, testing, and deployment—without human intervention. Imagine an AI that can take a project description, break it down, write the code, debug it, and then launch it, all on its own.... Now, this would greatly impact the work of software developers. The idea...
1
7888
isladogs
by: isladogs | last post by:
The next Access Europe User Group meeting will be on Wednesday 1 May 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome a new presenter, Adolph Dupré who will be discussing some powerful techniques for using class modules. He will explain when you may want to use classes instead of User Defined Types (UDT). For example, to manage the data in unbound forms. Adolph will...
0
5729
by: TSSRALBI | last post by:
Hello I'm a network technician in training and I need your help. I am currently learning how to create and manage the different types of VPNs and I have a question about LAN-to-LAN VPNs. The last exercise I practiced was to create a LAN-to-LAN VPN between two Pfsense firewalls, by using IPSEC protocols. I succeeded, with both firewalls in the same network. But I'm wondering if it's possible to do the same thing, with 2 Pfsense firewalls...
1
4540
by: 6302768590 | last post by:
Hai team i want code for transfer the data from one system to another through IP address by using C# our system has to for every 5mins then we have to update the data what the data is updated we have to send another system
2
4130
muto222
by: muto222 | last post by:
How can i add a mobile payment intergratation into php mysql website.

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.