Performance problem with correlated sub-query

I have created a web app that stores and displays all the messages from
my database maintenance jobs that run each night. The web app uses Java
servlets and has PostgreSQL 7.0 as the back end.



When the user requests the first page, he gets a list of all the servers
with maintenance records in the database, and a drop down list of all
the dates of maintenance records. If the user chooses a date first, then
the app uses a prepared statement with the date contained in a
parameter, and this executes very quickly - no problems.



However, if the web page user does not choose a date, then the app uses
a correlated sub-query to grab only the current (latest) day's
maintenance records. The query that is executed is:



select servername, databasename, message
from messages o
where o.date_of_msg =
    (select max(date_of_msg) from messages i
     where i.servername = o.servername);



And this is a dog. It takes 15 - 20 minutes to execute the query (there
are about 200,000 rows in the table). I have an index on (servername,
date_of_msg), but it doesn't seem to be used in this query.



Is there a way to improve the performance on this query?



Thanks,



Steve Howard




Nov 23 '05 #1
Howard, Steven (US - Tulsa) wrote:
> select servername, databasename, message from messages o where
> o.date_of_msg = (select max(date_of_msg) from messages i where
> i.servername = o.servername);
>
> And this is a dog. It takes 15 – 20 minutes to execute the
> query (there are about 200,000 rows in the table). I have an
> index on (servername, date_of_msg), but it doesn’t seem to
> be used in this query.


Just off the top of my head:

SELECT servername, databasename, message
FROM messages o
WHERE o.date_of_msg = (
    SELECT date_of_msg
    FROM messages i
    WHERE i.servername = o.servername
    ORDER BY date_of_msg DESC
    LIMIT 1
);

HTH,

Mike Mascari

---------------------------(end of broadcast)---------------------------
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faqs/FAQ.html

Nov 23 '05 #2
On Thu, 29 Apr 2004, Howard, Steven (US - Tulsa) wrote:
> I have created a web app that stores and displays all the messages from
> my database maintenance jobs that run each night. The web app uses Java
> servlets and has PostgreSQL 7.0 as the back end.

Step 1 is upgrade. ;)

> However, if the web page user does not choose a date, then the app uses
> a correlated sub-query to grab only the current (latest) day's
> maintenance records. The query that is executed is:
>
> select servername, databasename, message from messages o where
> o.date_of_msg =
> (select max(date_of_msg) from messages i where i.servername
> = o.servername);

This is likely to be running the subquery once for each row in messages,
and probably not going to use an index in the inner either. The former
might be optimized by recent versions.

Changing the inner query to something like:
(select date_of_msg from messages i where i.servername = o.servername
order by date_of_msg desc limit 1)

or changing it to use a subselect in from (something like):
from messages o, (select servername, max(date_of_msg) as max_date
from messages group by servername) i
where o.servername = i.servername and o.date_of_msg = i.max_date

might both help, but I'm not sure either will work on 7.0.

> And this is a dog. It takes 15 - 20 minutes to execute the query (there
> are about 200,000 rows in the table). I have an index on (servername,
> date_of_msg), but it doesn't seem to be used in this query.


You might wish to play around with changing the indexes and the order of
the columns in the multicolumn index as well.
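
One way to try the column-order idea, offered as a sketch rather than a tested fix: planners of that era reportedly considered a multicolumn index for an ORDER BY ... LIMIT 1 subquery only when the ORDER BY named the index columns in index order. The index name below is hypothetical; the table and column names come from the original post. Whether LIMIT inside a subquery is accepted on 7.0 at all is not confirmed anywhere in this thread.

```sql
-- Hypothetical index name; the original post only says that an index
-- exists on (servername, date_of_msg).
CREATE INDEX messages_server_date_idx ON messages (servername, date_of_msg);

-- Correlated subquery that sorts on both index columns, in index order.
-- The intent is to let the planner scan the index backward and stop at
-- the first row for the server instead of aggregating with max().
SELECT servername, databasename, message
FROM messages o
WHERE o.date_of_msg = (
    SELECT i.date_of_msg
    FROM messages i
    WHERE i.servername = o.servername
    ORDER BY i.servername DESC, i.date_of_msg DESC
    LIMIT 1
);
```

Comparing the EXPLAIN output with and without servername in the ORDER BY should show whether the index is actually picked up on a given version.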


Nov 23 '05 #4

On 29/04/2004 14:34 "Howard, Steven (US - Tulsa)" wrote:
> I have created a web app that stores and displays all the messages from
> my database maintenance jobs that run each night. The web app uses Java
> servlets and has PostgreSQL 7.0 as the back end.

7.0? That's positively ancient!

> When the user requests the first page, he gets a list of all the servers
> with maintenance records in the database, and a drop down list of all
> the dates of maintenance records. If the user chooses a date first, then
> the app uses a prepared statement with the date contained in a
> parameter, and this executes very quickly - no problems.
>
> However, if the web page user does not choose a date, then the app uses
> a correlated sub-query to grab only the current (latest) day's
> maintenance records. The query that is executed is:
>
> select servername, databasename, message from messages o where
> o.date_of_msg =
> (select max(date_of_msg) from messages i where i.servername
> = o.servername);
>
> And this is a dog. It takes 15 - 20 minutes to execute the query (there
> are about 200,000 rows in the table). I have an index on (servername,
> date_of_msg), but it doesn't seem to be used in this query.

PG doesn't use indexes for things like count(), max(), min()...

You can avoid using max() by something like

select my_date from my_table order by my_date desc limit 1;

which will use the index.
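
One caveat with this substitution, shown as a small sketch using the same placeholder names (my_table, my_date): max() ignores NULLs, whereas current PostgreSQL releases sort NULLs first under ORDER BY ... DESC by default. If the column can contain NULLs, filtering them out keeps the two forms equivalent.

```sql
-- my_table and my_date are the placeholder names used in the post above.
-- Excluding NULLs makes the first row returned match what max(my_date)
-- would give when the column is nullable.
SELECT my_date
FROM my_table
WHERE my_date IS NOT NULL
ORDER BY my_date DESC
LIMIT 1;
```

On an empty table this returns no row, while max() returns a single NULL. Used as a scalar subquery the difference disappears, since an empty subquery result is treated as NULL.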

> Is there a way to improve the performance on this query?


In addition to the above, I'd strongly recommend upgrading to 7.4 to take
advantage of the last ~4 years of continuous improvements.

--
Paul Thomas
Thomas Micro Systems Limited | Software Solutions for Business
Computer Consultants | http://www.thomas-micro-systems-ltd.co.uk

Nov 23 '05 #6

Howard, Steven (US - Tulsa) wrote:
> I have created a web app that stores and displays all the messages from
> my database maintenance jobs that run each night. The web app uses Java
> servlets and has PostgreSQL 7.0 as the back end.
>
> When the user requests the first page, he gets a list of all the servers
> with maintenance records in the database, and a drop down list of all
> the dates of maintenance records. If the user chooses a date first, then
> the app uses a prepared statement with the date contained in a
> parameter, and this executes very quickly – no problems.
>
> However, if the web page user does not choose a date, then the app uses
> a correlated sub-query to grab only the current (latest) day’s
> maintenance records. The query that is executed is:
>
> select servername, databasename, message from messages o where
> o.date_of_msg =
> (select max(date_of_msg) from messages i where i.servername
> = o.servername);
>
> And this is a dog. It takes 15 – 20 minutes to execute the query (there
> are about 200,000 rows in the table). I have an index on (servername,
> date_of_msg), but it doesn’t seem to be used in this query.


A few basic checks:
- What does EXPLAIN ANALYZE say for the slow query?
- Have you vacuumed and analyzed recently?
- Have you done basic optimisations from the default state? Check
http://www.varlena.com/varlena/Gener...bits/perf.html and
http://www.varlena.com/varlena/Gener...ed_conf_e.html
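
For reference, the first two checks might look like the following against the table from the original post. This is only a sketch: to the best of my knowledge EXPLAIN ANALYZE arrived in a release later than 7.0, so plain EXPLAIN is shown, which prints the plan without running the query.

```sql
-- Reclaim dead rows and refresh the planner's statistics for the table.
VACUUM ANALYZE messages;

-- Print the plan the planner would choose, without executing the query.
-- Look for a sequential scan on messages inside the subplan.
EXPLAIN
SELECT servername, databasename, message
FROM messages o
WHERE o.date_of_msg = (
    SELECT max(date_of_msg)
    FROM messages i
    WHERE i.servername = o.servername
);
```

Where EXPLAIN ANALYZE is available, keep in mind that it executes the statement, so it would take the full 15 to 20 minutes on this query.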

And 7.0 is way too old. If you can afford to upgrade, upgrade to 7.4.2.

HTH

Shridhar

Nov 23 '05 #8
