473,799 Members | 3,061 Online
Bytes | Software Development & Data Engineering Community
+ Post

Home Posts Topics Members FAQ

Odd query hang

Doozie of a problem here - may open a ticket with support, but wanted
to see what others had to say first...

We have an AIX 5.2 server with DB2 V8 FP5 and 32-bit instance. ~45GB
db. Reorgs and runstats are done weekly and the server is at least
decently tuned, although we're bumping up against the memory limits
with 32-bit. There is a series of 16 queries run in varying orders by
the users referred to as "mega-measures". Normally these queries
return their results (currently 9,126 rows) in a couple of minutes or
less each. Starting Sunday, some of them started to hang in a
non-uniform pattern. I am currently working with 3 queries on a test
system. Each of the queries runs successfully when run individually
(using db2 -tvf on a script). When all three are run in one of these
orders (query1, query2, query3) or (query3, query1, query2) in the
same script, the query that runs last hangs (see below for what I mean
by "hangs" in this case). I have user reports of these queries being
run individually hanging, but have not been able to reproduce that. If
I run the queries in this order (query3, query2, query1), they all
suceed. Also if I run the queries in any order from seperate
scripts(even within the same connection), they all succeed. These
results are re-creatable even after the db has been restarted, and
happen on both test and production (different hardware).

Changes to the system in the last week: APP_CTL_HEAP_SZ increased from
512 to 4096, LOCKLIST increased from 4096 to 8192, MAXFILOP increased
from 512 to 1024, TXNISOLATION cli config parameter changed from
default to 1(i.e CLI bound with UR). I cannot replicate the problem on
a system that does not have these 4 changes, but is identical in other
ways except it has a bit more data.

Definition of query hang in this situation: Query does not return,
even if left to run for 8 hours. No errors are returned or written to
the db2diag.log. List applications shows the application in an
"executing" status, but snapshots show no change in activity over the
course of time. No unusual CPU, Memory, or IO activity is aparent.

So - is it related to one of the 4 changes I made? If so, which one,
and why? If not, what the heck is going on here?

Thanks,

Ember Crooks
IBM Certified Database Administrator
DB2 UDB V8.1 for Linux, Unix, and Windows
Nov 12 '05 #1
3 2613
It seems to be a lock contention issue...since the problem is
reproducable only for specific order of execution, I would suggest
taking snapshots for locks for the affected database and have a look
at the output...

da****@hotmail. com (Ember) wrote in message news:<7c******* *************** ***@posting.goo gle.com>...
Doozie of a problem here - may open a ticket with support, but wanted
to see what others had to say first...

We have an AIX 5.2 server with DB2 V8 FP5 and 32-bit instance. ~45GB
db. Reorgs and runstats are done weekly and the server is at least
decently tuned, although we're bumping up against the memory limits
with 32-bit. There is a series of 16 queries run in varying orders by
the users referred to as "mega-measures". Normally these queries
return their results (currently 9,126 rows) in a couple of minutes or
less each. Starting Sunday, some of them started to hang in a
non-uniform pattern. I am currently working with 3 queries on a test
system. Each of the queries runs successfully when run individually
(using db2 -tvf on a script). When all three are run in one of these
orders (query1, query2, query3) or (query3, query1, query2) in the
same script, the query that runs last hangs (see below for what I mean
by "hangs" in this case). I have user reports of these queries being
run individually hanging, but have not been able to reproduce that. If
I run the queries in this order (query3, query2, query1), they all
suceed. Also if I run the queries in any order from seperate
scripts(even within the same connection), they all succeed. These
results are re-creatable even after the db has been restarted, and
happen on both test and production (different hardware).

Changes to the system in the last week: APP_CTL_HEAP_SZ increased from
512 to 4096, LOCKLIST increased from 4096 to 8192, MAXFILOP increased
from 512 to 1024, TXNISOLATION cli config parameter changed from
default to 1(i.e CLI bound with UR). I cannot replicate the problem on
a system that does not have these 4 changes, but is identical in other
ways except it has a bit more data.

Definition of query hang in this situation: Query does not return,
even if left to run for 8 hours. No errors are returned or written to
the db2diag.log. List applications shows the application in an
"executing" status, but snapshots show no change in activity over the
course of time. No unusual CPU, Memory, or IO activity is aparent.

So - is it related to one of the 4 changes I made? If so, which one,
and why? If not, what the heck is going on here?

Thanks,

Ember Crooks
IBM Certified Database Administrator
DB2 UDB V8.1 for Linux, Unix, and Windows

Nov 12 '05 #2
That's what I thought at first too - that's why I tried adding commits
and adding "for read only with ur" to the queries, and neither helped.
When I took and looked at lock snapshots in between the queries when I
was the only one on the system, It reported 0 locks currently held,
and the problem still occured. Also locktimeout is set to 60, so I
know it's not a lock wait (plus the app is in 'executing' status).

I let it run for 12 hours last night and it still didn't return.

I did verify with the customer - they indicated this is not a new
problem - it just seemed intermittent when they were in system test,
since the order of the queries matters, and they run them in varying
orders. So the problem has probably been there all along - since the
system was built and the queries were first tried.

Thanks!
sa************* *@marlabs.com (Passcore) wrote in message news:<ac******* *************** ****@posting.go ogle.com>...
It seems to be a lock contention issue...since the problem is
reproducable only for specific order of execution, I would suggest
taking snapshots for locks for the affected database and have a look
at the output...

da****@hotmail. com (Ember) wrote in message news:<7c******* *************** ***@posting.goo gle.com>...
Doozie of a problem here - may open a ticket with support, but wanted
to see what others had to say first...

We have an AIX 5.2 server with DB2 V8 FP5 and 32-bit instance. ~45GB
db. Reorgs and runstats are done weekly and the server is at least
decently tuned, although we're bumping up against the memory limits
with 32-bit. There is a series of 16 queries run in varying orders by
the users referred to as "mega-measures". Normally these queries
return their results (currently 9,126 rows) in a couple of minutes or
less each. Starting Sunday, some of them started to hang in a
non-uniform pattern. I am currently working with 3 queries on a test
system. Each of the queries runs successfully when run individually
(using db2 -tvf on a script). When all three are run in one of these
orders (query1, query2, query3) or (query3, query1, query2) in the
same script, the query that runs last hangs (see below for what I mean
by "hangs" in this case). I have user reports of these queries being
run individually hanging, but have not been able to reproduce that. If
I run the queries in this order (query3, query2, query1), they all
suceed. Also if I run the queries in any order from seperate
scripts(even within the same connection), they all succeed. These
results are re-creatable even after the db has been restarted, and
happen on both test and production (different hardware).

Changes to the system in the last week: APP_CTL_HEAP_SZ increased from
512 to 4096, LOCKLIST increased from 4096 to 8192, MAXFILOP increased
from 512 to 1024, TXNISOLATION cli config parameter changed from
default to 1(i.e CLI bound with UR). I cannot replicate the problem on
a system that does not have these 4 changes, but is identical in other
ways except it has a bit more data.

Definition of query hang in this situation: Query does not return,
even if left to run for 8 hours. No errors are returned or written to
the db2diag.log. List applications shows the application in an
"executing" status, but snapshots show no change in activity over the
course of time. No unusual CPU, Memory, or IO activity is aparent.

So - is it related to one of the 4 changes I made? If so, which one,
and why? If not, what the heck is going on here?

Thanks,

Ember Crooks
IBM Certified Database Administrator
DB2 UDB V8.1 for Linux, Unix, and Windows

Nov 12 '05 #3
Turns out to be APAR IY54318 - a problem with the same (or in this
case very similar) queries executed one after another, and traces back
to a problem with communication between the subagents when
INTRA_PARALLEL is on. Turning INTRA_PARALELL off or using a degree of
parallelism of 1 for the queries is the short term workaround. The fix
is apparently an architectural one, and will be available in db2 8.2 -
db2 support has yet to get back to me on whether it is possible to get
this fix for 8.1.

Thanks for your input!
da****@hotmail. com (Ember) wrote in message news:<7c******* *************** ***@posting.goo gle.com>...
Doozie of a problem here - may open a ticket with support, but wanted
to see what others had to say first...

We have an AIX 5.2 server with DB2 V8 FP5 and 32-bit instance. ~45GB
db. Reorgs and runstats are done weekly and the server is at least
decently tuned, although we're bumping up against the memory limits
with 32-bit. There is a series of 16 queries run in varying orders by
the users referred to as "mega-measures". Normally these queries
return their results (currently 9,126 rows) in a couple of minutes or
less each. Starting Sunday, some of them started to hang in a
non-uniform pattern. I am currently working with 3 queries on a test
system. Each of the queries runs successfully when run individually
(using db2 -tvf on a script). When all three are run in one of these
orders (query1, query2, query3) or (query3, query1, query2) in the
same script, the query that runs last hangs (see below for what I mean
by "hangs" in this case). I have user reports of these queries being
run individually hanging, but have not been able to reproduce that. If
I run the queries in this order (query3, query2, query1), they all
suceed. Also if I run the queries in any order from seperate
scripts(even within the same connection), they all succeed. These
results are re-creatable even after the db has been restarted, and
happen on both test and production (different hardware).

Changes to the system in the last week: APP_CTL_HEAP_SZ increased from
512 to 4096, LOCKLIST increased from 4096 to 8192, MAXFILOP increased
from 512 to 1024, TXNISOLATION cli config parameter changed from
default to 1(i.e CLI bound with UR). I cannot replicate the problem on
a system that does not have these 4 changes, but is identical in other
ways except it has a bit more data.

Definition of query hang in this situation: Query does not return,
even if left to run for 8 hours. No errors are returned or written to
the db2diag.log. List applications shows the application in an
"executing" status, but snapshots show no change in activity over the
course of time. No unusual CPU, Memory, or IO activity is aparent.

So - is it related to one of the 4 changes I made? If so, which one,
and why? If not, what the heck is going on here?

Thanks,

Ember Crooks
IBM Certified Database Administrator
DB2 UDB V8.1 for Linux, Unix, and Windows

Nov 12 '05 #4

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

1
3457
by: Ryan Govostes | last post by:
Eh, I was wondering if anyone could help me with a SQL query problem I'm having. I'm a complete newbie to SQL and MySQL, so any help would be greatly appreciated. I am using PHP to allow users to submit ISBNs, prices, and item conditions into a MySQL database. The table the values are inserted into is declared with CREATE TABLE prices (isbn VARCHAR(10), price FLOAT UNSIGNED, conditions VARCHAR(5))
4
6208
by: Madhu Gopinathan | last post by:
Hi All, I am faced with a horrible hang problem. I have a COM exe server that executes some tasks. The task execution manager is a thread that manages the pool of threads, which is 4 per processor. Each task is processed in a separate thread. Each of the executer threads is an STA thread, and it goes ahead and executes the task. No problems are encountered when tasks are executed one at a time, but when multiple tasks are executed...
4
1634
by: darrel | last post by:
I'm still struggling to find out why I keep getting Invalid object name 'monkey'. errors. Below is my try/catch statement. Note the first query is trying to delete a record in the table MONKEY. If I omit this, it performs the second query just fine, but it always hang on this first one. I've copied this MONKEY table into a new, differently named table, but still get the same error. I can run this query within Enterprise manager, and...
1
537
by: Ron | last post by:
Hi, Does a make table query have limitations that a select query doesn't have? This is my problem. I've got a make table query I'll call qrymktblTemp--it combines a few different tables and queries to form a new table that I then manipulate in various ways. It's worked well for all the records that qualify (like, it only includes records from tblBills that have a blank in "printed" field and leaves out ones that the "printed" field...
5
2417
by: marcsirois | last post by:
I have an SQL Query that looks like this. I simplified it for the purpose of this example, but it's still a runnable query SELECT TOP 1 FundedPositions.PositionNumber AS , (select top 1 assign_i from dbo_assign a where a.scenario_i=1 and a.assign_i = TestMyFunction(FundedPositions.PositionNumber))AS
7
6721
by: Jlo | last post by:
Hi, I have a c# winforms application. When I call the report file, it shows me all the records in the table. How can I make it to call only a particular range. i have the following code Viewer1.ReportSource = Application.StartupPath + "//Label.rpt"; How can I assign it a dataset which have the data of a particular range.
12
2116
by: bhipwell via AccessMonster.com | last post by:
Hello, I have hit the "Cannot open any more databases" and "System resource exceeded" errors. Knew this was coming, but as I got closer to finishing the database, I hoped it wouldn't be an issue. My DB consists of hundreds of unique pieces of information (in atomic tables of course). However, I have many multiple queries that are required to compile information and conduct various calculations to put it in a workable format for...
10
2653
by: Lloyd Harold | last post by:
I'm very new to PHP and attempting to put together a simple script for retrieving MySQL data of personal records. The MySQL table I'm using consists of: 0: id 1: name 2: location (an integer relating to a separate table of locations). 3: details
7
6028
ollyb303
by: ollyb303 | last post by:
Hi, I am having a bit of a problem with TransferText macro. I am using TransferText, Export Delimited (no field names) to export the results of a query as a .csv file. The query is based on a linked table and uses a CDate() expression to convert the timestamp field (Nom_Date) from the table to a date type and the criteria: Between (Now()-183) And Now() to narrow results to the last 6 months. The query works just fine on its own and...
0
9686
marktang
by: marktang | last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However, people are often confused as to whether an ONU can Work As a Router. In this blog post, we’ll explore What is ONU, What Is Router, ONU & Router’s main usage, and What is the difference between ONU and Router. Let’s take a closer look ! Part I. Meaning of...
0
9540
by: Hystou | last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can effortlessly switch the default language on Windows 10 without reinstalling. I'll walk you through it. First, let's disable language synchronization. With a Microsoft account, language settings sync across devices. To prevent any complications,...
0
10475
Oralloy
by: Oralloy | last post by:
Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers, it seems that the internal comparison operator "<=>" tries to promote arguments from unsigned to signed. This is as boiled down as I can make it. Here is my compilation command: g++-12 -std=c++20 -Wnarrowing bit_field.cpp Here is the code in...
0
10250
jinu1996
by: jinu1996 | last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven tapestry of website design and digital marketing. It's not merely about having a website; it's about crafting an immersive digital experience that captivates audiences and drives business growth. The Art of Business Website Design Your website is...
1
10222
by: Hystou | last post by:
Overview: Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows Update option using the Control Panel or Settings app; it automatically checks for updates and installs any it finds, whether you like it or not. For most users, this new feature is actually very convenient. If you want to control the update process,...
0
9068
agi2029
by: agi2029 | last post by:
Let's talk about the concept of autonomous AI software engineers and no-code agents. These AIs are designed to manage the entire lifecycle of a software development project—planning, coding, testing, and deployment—without human intervention. Imagine an AI that can take a project description, break it down, write the code, debug it, and then launch it, all on its own.... Now, this would greatly impact the work of software developers. The idea...
1
7564
isladogs
by: isladogs | last post by:
The next Access Europe User Group meeting will be on Wednesday 1 May 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome a new presenter, Adolph Dupré who will be discussing some powerful techniques for using class modules. He will explain when you may want to use classes instead of User Defined Types (UDT). For example, to manage the data in unbound forms. Adolph will...
0
5463
by: TSSRALBI | last post by:
Hello I'm a network technician in training and I need your help. I am currently learning how to create and manage the different types of VPNs and I have a question about LAN-to-LAN VPNs. The last exercise I practiced was to create a LAN-to-LAN VPN between two Pfsense firewalls, by using IPSEC protocols. I succeeded, with both firewalls in the same network. But I'm wondering if it's possible to do the same thing, with 2 Pfsense firewalls...
3
2938
bsmnconsultancy
by: bsmnconsultancy | last post by:
In today's digital era, a well-designed website is crucial for businesses looking to succeed. Whether you're a small business owner or a large corporation in Toronto, having a strong online presence can significantly impact your brand's success. BSMN Consultancy, a leader in Website Development in Toronto offers valuable insights into creating effective websites that not only look great but also perform exceptionally well. In this comprehensive...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.