Weird speed problem - Microsoft SQL Server

DBMonitor

I have a table on a database that contains 18million records. I need
to design a system that queries this table to produce fast counts.

I have got counts for multiple criteria down to only a few seconds.
Most take under a second however I have a few queries that seam to
take longer which I am working on reducing the time.

I have found some strange behavour in the way SQL Server works.

Take the following two queries which produce exactly the same result:

---------------------------------
select count(*)
from dbo.table
where column1='value1'
and column2='value2'
--------------------------------
and

--------------------------------
select count(*)
from (select id from table where column1 = 'value1') as value1
join (select id from table where column2 = 'value2') as value2
on value1.id = value2.id
---------------------------------

I would assume that the first query should run faster then the second
query. When I look at the query plans, they are almost identical cost
wise. The first takes about 53% of the cost and the second takes 47%.

Yet, the first query takes about 25 seconds to run and the second
takes only 5 seconds.

Does anyone know of a reason why there would be such a difference in
query speed?

Oct 25 '07 #1

Subscribe Post Reply

1744

DBMonitor

On Oct 25, 3:13 pm, DBMonitor <spamawa...@yahoo.com.auwrote:

I have a table on a database that contains 18million records. I need
to design a system that queries this table to produce fast counts.

I have got counts for multiple criteria down to only a few seconds.
Most take under a second however I have a few queries that seam to
take longer which I am working on reducing the time.

I have found some strange behavour in the way SQL Server works.

Take the following two queries which produce exactly the same result:

---------------------------------
select count(*)
from dbo.table
where column1='value1'
and column2='value2'
--------------------------------
and

--------------------------------
select count(*)
from (select id from table where column1 = 'value1') as value1
join (select id from table where column2 = 'value2') as value2
on value1.id = value2.id
---------------------------------

I would assume that the first query should run faster then the second
query. When I look at the query plans, they are almost identical cost
wise. The first takes about 53% of the cost and the second takes 47%.

Yet, the first query takes about 25 seconds to run and the second
takes only 5 seconds.

Does anyone know of a reason why there would be such a difference in
query speed?

BTW: Cubes are not an option in this senario.

Oct 25 '07 #2

Roy Harvey (SQL Server MVP)

My guess - and I can only guess from the given information - is that
there is an index on dbo.table(column1), and an index on
dbo.table(column2), but no index on dbo.table(column1,column2). In
the first example the optimizer chooses to use one of the indexes,
then has to scan to resolve the other test. In the second version of
the query the optimizer is using both indexes and then matching up the
results.

If this combination of columns is queried often you might consider
changing one of the two indexes to add the other column as the second
column of the index. So you might end up with the two indexes as
(column1), and (column2,column1). You would not need to indexes with
the same first column.

Roy Harvey
Beacon Falls, CT

On Wed, 24 Oct 2007 22:13:11 -0700, DBMonitor
<sp********@yahoo.com.auwrote:

>I have a table on a database that contains 18million records. I need
to design a system that queries this table to produce fast counts.

I have got counts for multiple criteria down to only a few seconds.
Most take under a second however I have a few queries that seam to
take longer which I am working on reducing the time.

I have found some strange behavour in the way SQL Server works.

Take the following two queries which produce exactly the same result:

---------------------------------
select count(*)
from dbo.table
where column1='value1'
and column2='value2'
--------------------------------
and

--------------------------------
select count(*)
from (select id from table where column1 = 'value1') as value1
join (select id from table where column2 = 'value2') as value2
on value1.id = value2.id
---------------------------------

I would assume that the first query should run faster then the second
query. When I look at the query plans, they are almost identical cost
wise. The first takes about 53% of the cost and the second takes 47%.

Yet, the first query takes about 25 seconds to run and the second
takes only 5 seconds.

Does anyone know of a reason why there would be such a difference in
query speed?

Oct 25 '07 #3

Hugo Kornelis

On Wed, 24 Oct 2007 22:13:11 -0700, DBMonitor wrote:

>Does anyone know of a reason why there would be such a difference in
query speed?

Hi DBMonitor,

In addition to the suggestion posted by Roy, have you considered that
this may be caused by cachhing? In other words, if you run the second
query first and the first query last, or if you run both queries
repeatedly, does that change anything to the execution time?

--
Hugo Kornelis, SQL Server MVP
My SQL Server blog: http://sqlblog.com/blogs/hugo_kornelis

Oct 25 '07 #4

Erland Sommarskog

DBMonitor (sp********@yahoo.com.au) writes:

Both query plans use indexes on for both the tables though. The plans
are almost identical.

"Almost". Apparently, there is a subtle, but important difference.

Would it be posible for you to post the query plans? If you are on SQL 2005
you can save the graphical execution plan in a file and post that in an
attachment. (Or put it on a web site with a link to it.)
--
Erland Sommarskog, SQL Server MVP, es****@sommarskog.se

Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/pro...ads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinf...ons/books.mspx

Oct 25 '07 #5

DBMonitor

On Oct 26, 7:39 am, Erland Sommarskog <esq...@sommarskog.sewrote:

DBMonitor (spamawa...@yahoo.com.au) writes:
Both query plans use indexes on for both the tables though. The plans
are almost identical.

"Almost". Apparently, there is a subtle, but important difference.

Would it be posible for you to post the query plans? If you are on SQL 2005
you can save the graphical execution plan in a file and post that in an
attachment. (Or put it on a web site with a link to it.)

The query plans are as follows:

-----------------
Subquery Query (Total cost 41.52%)
SELECT (0%)<-CS (0%)<-HM IJ (68%)<-IS Col1 (11%)
..................................<-IS Col2 (20%)

Standard Query (Total Cost 58.48%)
SELECT (0%)<-CS (0%)<-HM IJ (78%)<-IS Col1 (8%)
..................................<-IS Col2 (14%)

KEY:
CS - Comput Scalar
HM IJ - Hash Match/Inner Join
IS - Index Seek

-------------------------

The main difference is the hash match process. For the sub querys, the
row size is 15 for the query with the sub queries and 21 for the
standard query however the sub query query has more rows on it.

The only thing I can think of which is happening is some sort of page
io sharing problem. The database server is running on a virtual
machine and the last wait type for the queries are always
'PAGEIOLATCH_SH' and the process runs on one thread.

When I run it on a dedicated server, the times to run the queries are
almost identical and the lastwaittypes change to CXPACKET and uses
multiple threads.

Oct 26 '07 #6

Erland Sommarskog

DBMonitor (sp********@yahoo.com.au) writes:

The query plans are as follows:

-----------------
Subquery Query (Total cost 41.52%)
SELECT (0%)<-CS (0%)<-HM IJ (68%)<-IS Col1 (11%)
.................................<-IS Col2 (20%)

Standard Query (Total Cost 58.48%)
SELECT (0%)<-CS (0%)<-HM IJ (78%)<-IS Col1 (8%)
.................................<-IS Col2 (14%)

KEY:
CS - Comput Scalar
HM IJ - Hash Match/Inner Join
IS - Index Seek

I will have to confess that I hoped to see the full plans. Oh well.
Your choice.

When I run it on a dedicated server, the times to run the queries are
almost identical and the lastwaittypes change to CXPACKET and uses
multiple threads.

What happens if you add OPTION (MAXDOP 1) to force a non-parallel plan?

--
Erland Sommarskog, SQL Server MVP, es****@sommarskog.se

Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/pro...ads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinf...ons/books.mspx

Oct 26 '07 #7

Similar topics

Speed: bytecode vz C API calls

by: Jacek Generowicz | last post by:

I have a program in which I make very good use of a memoizer: def memoize(callable): cache = {} def proxy(*args): try: return cache except KeyError: return cache.setdefault(args,...

Python

off-topic: Why is lisp so weird?

by: nobody | last post by:

Howdy, Mike! mikecoxlinux@yahoo.com (Mike Cox) wrote in message news:<3d6111f1.0402271647.c20aea3@posting.google.com>... > I'm a C++ programmer, and have to use lisp because I want to use >...

C / C++

Weird behavior with window.open() and IE

by: Woody | last post by:

I opened a JPEG image in a separate window using window.open("mypic.jpg"). When the new window opened, I immediately saw the QuickTime media player load on the window. The QuickTime player then...

Javascript

Python code written in 1998, how to improve/change it?

by: Petr Jakes | last post by:

Hello, I am trying to study/understand OOP principles using Python. I have found following code http://tinyurl.com/a4zkn about FSM (finite state machine) on this list, which looks quite useful for...

Python

Weird behavoir while using function aliases

by: Daniel Rucareanu | last post by:

I have the following script: function Test(){} Test.F = function(){} Test.F.FF = function(){} Test.F.FF.FFF = function(){} Test.F.FF.FFF.FFFF = function(){} //var alias = function(){}; var...

Javascript

Weird marquee behavior in IE

by: Charles | last post by:

I have a news marquee which works well in most browsers: http://mhariolincoln.jor.br/test.html Problem is that when I include it in my web page, it stops scrolling when the line of text reaches...

Javascript

Weird problem in forms?

by: Henrootje | last post by:

I have a problem which I do not understand........................ I have a split database. Tables in the backend on a networkdrive, all other in frontend placed in %temp% I have this form ...

Microsoft Access / VBA

weird variable declarations

by: John Williams | last post by:

I've been spending some time expanding my rather primitive knowledge of c++ through various exercises. One of them that I've been working on is learning a bit about how compression works. I've...

C / C++

.mdb and TCP - weird problem

by: ongaro.admin | last post by:

Hi, I'm experiencing a strange problem with .mdb files. We have two buildings connected by optical fiber (a single LAN). Everything works perfect with any file, any size, any application...

Microsoft Access / VBA

How to turn on java script in a villaon keypad mobile phone

by: Charles Arthur | last post by:

How do i turn on java script on a villaon, callus and itel keypad mobile phone

Java

Migrating Website to Cloud - Emmanuel Katto

by: emmanuelkatto | last post by:

Hi All, I am Emmanuel katto from Uganda. I want to ask what challenges you've faced while migrating a website to cloud. Please let me know. Thanks! Emmanuel

General

Looking to do Android software development, any suggestions? Is flutter better?

by: nemocccc | last post by:

hello, everyone, I want to develop a software for my android phone for daily needs, any suggestions?

General

Is that possible of reading the .csv file in column wise and the column have different lengths ?

by: Sonnysonu | last post by:

This is the data of csv file 1 2 3 1 2 3 1 2 3 1 2 3 2 3 2 3 3 the lengths should be different i have to store the data by column-wise with in the specific length. suppose the i have to...

C / C++

How to build RAID in BIOS?

by: Hystou | last post by:

There are some requirements for setting up RAID: 1. The motherboard and BIOS support RAID configuration. 2. The motherboard has 2 or more available SATA protocol SSD/HDD slots (including MSATA, M.2...

Computer Hardware

Problem With Comparison Operator <=> in G++

by: Oralloy | last post by:

Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers,...

C / C++

The easy way to turn off automatic updates for Windows 10/11

by: Hystou | last post by:

Overview: Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows...

Windows Server

Discussion: How does Zigbee compare with other wireless protocols in smart home applications?

by: tracyyun | last post by:

Dear forum friends, With the development of smart home technology, a variety of wireless communication protocols have appeared on the market, such as Zigbee, Z-Wave, Wi-Fi, Bluetooth, etc. Each...

General

AI Job Threat for Devs

by: agi2029 | last post by:

Let's talk about the concept of autonomous AI software engineers and no-code agents. These AIs are designed to manage the entire lifecycle of a software development project—planning, coding, testing,...

Career Advice