
[How to] avoid cross product/Cartesian product to improve performance

P: 2
Hi,

I have a table containing about 4 million entries, structured like below:

[Alarm History]
(
AlarmID int,
SetTime datetime
)

Now I want to:
SELECT all the AlarmIDs that occurred during Jan 2008 with no such AlarmID during Dec 2007. I used:

SELECT * FROM [Alarm History]
WHERE SetTime BETWEEN '1-jan-2008' AND '31-jan-2008'
AND AlarmID NOT IN

(
SELECT AlarmID FROM [Alarm History]
WHERE SetTime BETWEEN '1-dec-2007' AND '31-DEC-2007'
)

My query runs very slowly, since there are more than 20,000 entries during Dec 2007 and 15,000 entries during Jan 2008.

The "NOT IN" operation is like a Cartesian product: it compares each AlarmID during December 2007 with each AlarmID during January 2008. The result is very slow performance.

Can anybody help me find an alternative way to do this? Thanks a lot.
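(For reference, the pattern above can be reproduced on a small toy table. This is a minimal sketch using SQLite from Python; the sample rows are made up, and ISO date strings replace the 'dd-mon-yyyy' literals, which SQLite does not parse.)

```python
import sqlite3

# Toy stand-in for the 4-million-row [Alarm History] table (hypothetical data)
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE alarm_history (AlarmID INTEGER, SetTime TEXT)")
con.executemany(
    "INSERT INTO alarm_history VALUES (?, ?)",
    [
        (1, "2007-12-05"),  # fired in Dec...
        (1, "2008-01-10"),  # ...and again in Jan -> excluded
        (2, "2008-01-15"),  # Jan only -> should be returned
        (3, "2007-12-20"),  # Dec only -> not returned
    ],
)

# Same NOT IN shape as the question, with ISO dates
rows = con.execute(
    """
    SELECT DISTINCT AlarmID FROM alarm_history
    WHERE SetTime BETWEEN '2008-01-01' AND '2008-01-31'
      AND AlarmID NOT IN (
        SELECT AlarmID FROM alarm_history
        WHERE SetTime BETWEEN '2007-12-01' AND '2007-12-31'
      )
    """
).fetchall()
print(rows)  # only AlarmID 2 appears in Jan but not Dec
```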
Jan 31 '08 #1
5 Replies


code green
Expert 100+
P: 1,726
It is probably the sub-query that slows it down.
There is an equivalent JOIN that looks for NULL instead of using NOT IN:
SELECT Jan.* FROM [Alarm History] Jan
LEFT JOIN [Alarm History] Dec ON (Jan.AlarmID = Dec.AlarmID
AND Dec.SetTime BETWEEN '1-dec-2007' AND '31-dec-2007')
WHERE Jan.SetTime BETWEEN '1-jan-2008' AND '31-jan-2008'
AND Dec.AlarmID IS NULL
I think
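(The LEFT JOIN rewrite can be checked against the original NOT IN form on a toy SQLite table; this is a minimal sketch with made-up data, using ISO date strings.)

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE alarm_history (AlarmID INTEGER, SetTime TEXT)")
con.executemany("INSERT INTO alarm_history VALUES (?, ?)",
                [(1, "2007-12-05"), (1, "2008-01-10"),
                 (2, "2008-01-15"), (3, "2007-12-20")])

# Original NOT IN form
not_in = con.execute("""
    SELECT DISTINCT AlarmID FROM alarm_history
    WHERE SetTime BETWEEN '2008-01-01' AND '2008-01-31'
      AND AlarmID NOT IN (SELECT AlarmID FROM alarm_history
                          WHERE SetTime BETWEEN '2007-12-01' AND '2007-12-31')
""").fetchall()

# LEFT JOIN the December rows and keep only January alarms with no match
left_join = con.execute("""
    SELECT DISTINCT Jan.AlarmID
    FROM alarm_history Jan
    LEFT JOIN alarm_history Dec
      ON Jan.AlarmID = Dec.AlarmID
     AND Dec.SetTime BETWEEN '2007-12-01' AND '2007-12-31'
    WHERE Jan.SetTime BETWEEN '2008-01-01' AND '2008-01-31'
      AND Dec.AlarmID IS NULL
""").fetchall()

print(not_in, left_join)  # both return only AlarmID 2
```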
Jan 31 '08 #2

Delerna
Expert 100+
P: 1,134
If you want to compare the effectiveness of two different ways of writing a query:

Open Query Analyzer and paste the two queries into the query window.

Then click the "Display Estimated Execution Plan" button, and it will display a graphical representation of how each query will execute.

I did that for the two queries here, changing table and field names to suit one of my tables.

Sorry to say that, at least in the check that I did, code green's method was about 3 times slower: 28% (lightkeeper's method) to 72% (code green's method). That may be different when you try it on your own table.

The execution plan will also show you which parts of your query are spending the most time, and therefore where you might be able to improve performance. Look for loops and high I/O costs when hunting for bottlenecks.

All the above applies to MS SQL Server. I guess other database engines have something similar.
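(Query Analyzer and its graphical plan display are MS SQL Server tools. As an analogous illustration only, SQLite exposes its plan through EXPLAIN QUERY PLAN; the table here is a made-up toy.)

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE alarm_history (AlarmID INTEGER, SetTime TEXT)")

# Ask the engine how it would execute the query, without running it
plan = con.execute("""
    EXPLAIN QUERY PLAN
    SELECT AlarmID FROM alarm_history
    WHERE SetTime BETWEEN '2008-01-01' AND '2008-01-31'
""").fetchall()
for row in plan:
    print(row)  # the last field is the plan detail, e.g. a full SCAN of the table
```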
Feb 1 '08 #3

code green
Expert 100+
P: 1,726
Wow! 3 times slower. That surprised me.
What is really slowing the query down is the date comparison.
I have tried similar queries in both formats on my online server (1and1).
They both timed out.
I was able to get around it because my table used auto-ids.
I then used SQL variables to get the IDs of the minimum and maximum dates.
Then did a SELECT comparing IDs rather than DATE.
Very fast.
-- Get the IDs of the earliest and latest dates
SELECT @earliest := MIN(AlarmID) FROM [Alarm History]
WHERE SetTime >= '1-jan-2008';
SELECT @latest := MAX(AlarmID) FROM [Alarm History]
WHERE SetTime <= '31-jan-2008';

-- Use these to filter the main query
SELECT AlarmID FROM [Alarm History]
WHERE AlarmID >= @earliest AND AlarmID <= @latest
AND ....

I know this is MySQL, but could you adapt this idea to your table?
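(The ID-range idea can be sketched in SQLite. This is a minimal sketch with made-up data; the `events` table and its `id` column are hypothetical, and the trick is only valid because `id` is an auto-increment key, so IDs increase in the same order as SetTime.)

```python
import sqlite3

con = sqlite3.connect(":memory:")
# Hypothetical table where id is an auto-increment key (id order == insert order)
con.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, SetTime TEXT)")
con.executemany("INSERT INTO events (SetTime) VALUES (?)",
                [("2007-12-20",), ("2008-01-05",), ("2008-01-20",), ("2008-02-02",)])

# Find the id range covering January, then filter by id instead of by date
lo = con.execute("SELECT MIN(id) FROM events WHERE SetTime >= '2008-01-01'").fetchone()[0]
hi = con.execute("SELECT MAX(id) FROM events WHERE SetTime <= '2008-01-31'").fetchone()[0]

# Integer comparison on the primary key replaces the date comparison
jan = con.execute("SELECT id FROM events WHERE id BETWEEN ? AND ?", (lo, hi)).fetchall()
print(jan)  # ids 2 and 3, the two January rows
```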
Feb 1 '08 #4

Delerna
Expert 100+
P: 1,134
Yeah, it surprised me also, as I also thought the subquery was the problem, but it seems SQL Server does a pretty good job of executing it. I also tried a method of my own and was beaten 48% to 52%. I have not personally used a "WHERE NOT IN" query myself, but I think I will be taking a closer look at it in the future.

One thing that may help the speed of this query is indexes. I have seen slow queries that had nothing wrong with the way they were written; the sheer number of records was the cause. Well thought out indexes took those queries from minutes to a few seconds.
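(The effect of an index shows up even in a toy SQLite example; the table data and index name here are made up. Without the index the engine scans the whole table; with it, the plan becomes an index search.)

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE alarm_history (AlarmID INTEGER, SetTime TEXT)")

query = """EXPLAIN QUERY PLAN
           SELECT AlarmID FROM alarm_history
           WHERE SetTime BETWEEN '2008-01-01' AND '2008-01-31'"""

before = con.execute(query).fetchall()[0][3]   # plan detail before indexing
con.execute("CREATE INDEX idx_settime ON alarm_history (SetTime)")
after = con.execute(query).fetchall()[0][3]    # plan detail after indexing

print(before)  # a SCAN of the whole table
print(after)   # a SEARCH using idx_settime
```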
Feb 1 '08 #5

P: 1
Use a LEFT OUTER JOIN to your "not in" table, group by some columns in table a and at least one column in table b (the "not in" table), having b.some_column IS NULL.

This is much more efficient than a NOT IN statement.

i.e.

select a.col1, a.col2
from tablea as a
left outer join tableb as b
  on b.fkey_id = a.id
group by a.col1, a.col2, b.some_column
having b.some_column is null
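(The grouped form above can be sketched the same way on toy SQLite tables; tablea, tableb, and the column names are the reply's own placeholders, and the rows are made up.)

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE tablea (id INTEGER, col1 TEXT, col2 TEXT)")
con.execute("CREATE TABLE tableb (fkey_id INTEGER, some_column TEXT)")
con.executemany("INSERT INTO tablea VALUES (?, ?, ?)",
                [(1, "a", "x"), (2, "b", "y")])
con.execute("INSERT INTO tableb VALUES (1, 'seen')")  # id 1 has a match, id 2 does not

# Rows of tablea with no matching row in tableb
rows = con.execute("""
    SELECT a.col1, a.col2
    FROM tablea AS a
    LEFT OUTER JOIN tableb AS b ON b.fkey_id = a.id
    GROUP BY a.col1, a.col2, b.some_column
    HAVING b.some_column IS NULL
""").fetchall()
print(rows)  # only the unmatched row ('b', 'y') survives
```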
Jul 16 '13 #6
