Bytes | Software Development & Data Engineering Community

Multiple join performance

Hi everybody,

let's consider this scenario: you have 1 data-table and 10
dictionary-tables; the data-table has 5 million records and 30 columns, and 10
of these columns have a foreign key to the dictionary-tables. Almost all of the
dictionary-tables have only two columns (code and description)
and a low number of records (fewer than 100 in most cases, but one table has
1000 records and another one has 8000 records). When you perform a query on
the data-table you must show the descriptions taken from the
dictionary-tables; you have two options to do this:

1) A multiple join (11 tables) to get both main data and descriptions.
2) Load the dictionary-tables permanently into memory (as hashmaps) and
query only the data-table; the application then looks up the descriptions
in the hashmaps. Consider that you are writing a web application, so the
hashmaps can be held in application scope and shared by all the users (more
than 1000 users).
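
In SQL terms, option 1 would look roughly like this (the table and column
names here are just an example, not my real schema):

```sql
-- Hypothetical schema: data_table has columns code1..code10,
-- each a foreign key to dict1..dict10 (code, description).
SELECT d.id,
       d.amount,
       t1.description AS desc1,
       t2.description AS desc2
       -- ... one more column per dictionary table, 10 in total
FROM data_table d
JOIN dict1 t1 ON t1.code = d.code1
JOIN dict2 t2 ON t2.code = d.code2
-- JOIN dict3 t3 ON t3.code = d.code3, and so on up to dict10
WHERE d.some_filter = ?
```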

Which solution performs better? I think this scenario is rather common.
I'm a little afraid of doing a multiple join on a table that has 5 million
records.

Tell me your opinion.
Pino
Nov 12 '05 #1
Pino,

you can do the same within DB2.
Keep the tables in a dedicated bufferpool.
Let DB2 figure out whether it wants to use hash joins or something else.

The rule of thumb is: If it's relational let the DBMS deal with it.
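
For example, something along these lines (the names, sizes and path are
illustrative only, adjust them for your system):

```sql
-- Pin the small dictionary tables in their own bufferpool so they
-- stay memory-resident, the in-database equivalent of your hashmaps.
CREATE BUFFERPOOL dict_bp IMMEDIATE SIZE 2000 PAGESIZE 4K;
CREATE TABLESPACE dict_ts PAGESIZE 4K
  MANAGED BY SYSTEM USING ('/db2/dict_ts')
  BUFFERPOOL dict_bp;
-- Create (or move) the dictionary tables in that tablespace:
CREATE TABLE dict1 (code INTEGER NOT NULL PRIMARY KEY,
                    description VARCHAR(100)) IN dict_ts;
```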

Cheers
Serge

--
Serge Rielau
DB2 SQL Compiler Development
IBM Toronto Lab
Nov 12 '05 #2
I have a different experience, with W2K, v7, FP10a.
Some time ago I wrote a select statement over 6-7 tables with 200,000 -
300,000 rows and 9-10 dictionary tables with a small number of rows. This query
always returned 1 row (it was contract information) and took about 1
minute; most of this time was spent on compilation (we used query
optimization level 3). I had to write an SQL stored procedure where I split
this query into 2 parts: first I populate a global temporary table with a
select over only the 6-7 large tables, and second I join this temporary table
with the dictionary tables. This SQL SP executed in about 5 seconds...
Conclusion: the DB2 optimizer is a great thing in general, but sometimes...
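
In outline, the split looked like this (all table and column names here are
made up, the real query was much wider):

```sql
-- Step 0: a session temporary table for the intermediate result.
DECLARE GLOBAL TEMPORARY TABLE session.contract_tmp
  (contract_id INTEGER, status_code INTEGER,
   region_code INTEGER, amount DECIMAL(15,2))
  ON COMMIT PRESERVE ROWS NOT LOGGED;

-- Step 1: join only the large tables.
INSERT INTO session.contract_tmp
  SELECT c.contract_id, c.status_code, c.region_code, p.amount
  FROM contracts c
  JOIN payments p ON p.contract_id = c.contract_id
  WHERE c.contract_no = ?;

-- Step 2: join the small intermediate result with the dictionaries.
SELECT t.contract_id, s.description, r.description, t.amount
FROM session.contract_tmp t
JOIN dict_status s ON s.code = t.status_code
JOIN dict_region r ON r.code = t.region_code;
```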

Best regards,
Mark.
> Keep the tables in a dedicated bufferpool.
> Let DB2 figure out whether it wants to use hashjoins or something else.
>
> The rule of thumb is: If it's relational let the DBMS deal with it.


Nov 12 '05 #3
Mark,

Why did you use dynamic SQL?
If you could use a stored proc for the piecemeal version, you could equally
have written a proc with the whole join in it, if you don't want to deal with
packages and the query doesn't execute often enough to stay in the cache.

Cheers
Serge
--
Serge Rielau
DB2 SQL Compiler Development
IBM Toronto Lab
Nov 12 '05 #4
> Why did you use dynamic SQL?

Sure; first of all I wrote this SP with a static select, but it was
considerably slower (as I remember, ~30 sec versus 3-5 sec) than the
split dynamic version... (I didn't understand why, and decided to use
dynamic SQL.)
Nov 12 '05 #5
You may want to look at the DB2_REDUCED_OPTIMIZATION registry
variable. Set it to an integer value, and DB2 drops the optimization
level for dynamic queries joining more tables than that integer. Very
useful.
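
It is set with db2set and, like most registry variables, takes effect after
an instance restart (the value 5 here is only an example threshold):

```shell
db2set DB2_REDUCED_OPTIMIZATION=5
db2stop
db2start
```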

/T

"Mark Barinstein" <ma**@crk.vsi.ru> wrote in message news:<ca***********@serv3.vsi.ru>...
> I have another experience with w2k, v7, fp10a.
> Some times ago I wrote a select statement on 6-7 tables with 200 000 - 300
> 000 rows and 9-10 dictionary tables with a few number of rows. This query
> always returned 1 row (it was a contract information) and lasted about 1
> minute. Most of this time was taken for a compilation (we used query
> optimization 3). And I had to write SQL stored procedure where I splited
> this query on 2 parts: first I populate global temporary table with select
> only over 6-7 large tables and second I join this temporary table with
> dictionary tables. This SQL SP executed about 5 seconds...
> Conclusion: DB2 optimizer - great thing in common, but sometimes...

Nov 12 '05 #6
I would use 1. I strongly suspect that 2 would be slower. It definitely
takes longer to code, and it fails to handle concurrent inserts and
updates to the dictionary tables (your hashmaps can become out-of-date), so
it would also be the source of various problems.

If join performance were to become a problem - which is not likely in your
simple join scenario - then you can resolve it by using a materialized
query table (see CREATE TABLE in the SQL Reference).
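
A minimal sketch of such a table, with illustrative names and only one
dictionary shown:

```sql
-- A materialized query table that precomputes the join once;
-- queries can then read it instead of re-joining every time.
CREATE TABLE data_with_desc AS
  (SELECT d.id, d.code1, t1.description AS desc1
   FROM data_table d
   JOIN dict1 t1 ON t1.code = d.code1)
  DATA INITIALLY DEFERRED REFRESH DEFERRED;
REFRESH TABLE data_with_desc;
```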

"Pino" <no****@novirus.invalid> wrote in message
news:gn********************@twister2.libero.it...
> Hi everybody,
>
> let's consider this scenario: you have 1 data-table and 10
> dictionary-tables; the data-table has 5 million records and 30 columns, 10
> of these columns have a foreign-key to the dictionary-tables: the
> dictionary-tables have (almost all) only two columns (code and description)
> and a low number of records (less than 100 in most cases, but one table has
> 1000 records and another one has 8000 records). When you perform a query on
> the data-table you must show the descriptions taken from the
> dictionary-tables; you have two options to do this:
>
> 1) A multiple join (11 tables) to get both main data and descriptions.
> 2) Load permanently the dictionary-tables in memory (using hashmaps) and
> query only the data-table, then the application looks up the descriptions
> from the hashmaps. Consider that you are writing a web application, so the
> hashmaps can be held in application scope to be used by all the users (more
> than 1000 users).
>
> Which solution performs better? I think this scenario is rather common.
> I'm a little afraid of doing a multiple join on a table that has 5 million
> records.
>
> Tell me your opinion.
> Pino

Nov 12 '05 #7
Tomas,

I don't like optimization levels less than 3. They often lead to poor query
plans for queries with 5-6 fairly large tables... Besides, our system is both
OLTP and DSS, so I think we can't use this variable...

Best regards,
Mark.

"Tomas Hallin" <ja*****@hotmail.com> wrote in message
news:2c************************@posting.google.com...
> You may want to look at the DB2_REDUCED_OPTIMIZATION registry
> variable. Set it to an integer value, and DB2 drops the optimization
> level down for dynamic queries joining more tables than specified in
> that integer. Very useful.


Nov 12 '05 #8

This thread has been closed and replies have been disabled. Please start a new discussion.
