Bytes | Software Development & Data Engineering Community

Multiple join performance

Hi everybody,

let's consider this scenario: you have 1 data-table and 10
dictionary-tables; the data-table has 5 million records and 30 columns, and 10
of these columns have a foreign key to the dictionary-tables. Almost all of the
dictionary-tables have only two columns (code and description)
and a low number of records (fewer than 100 in most cases, but one table has
1000 records and another one has 8000 records). When you perform a query on
the data-table you must show the descriptions taken from the
dictionary-tables; you have two options to do this:

1) A multiple join (11 tables) to get both main data and descriptions.
2) Load the dictionary-tables permanently into memory (as hashmaps) and
query only the data-table; the application then looks up the descriptions
in the hashmaps. Consider that you are writing a web application, so the
hashmaps can be held in application scope and shared by all the users (more
than 1000 users).
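
In SQL terms, option 1 would look roughly like this (the table and column
names here are just an example, not my real schema):

```sql
-- Hypothetical schema: data_table has columns code1..code10,
-- each a foreign key to dict1..dict10 (code, description).
SELECT d.id,
       d.amount,
       t1.description AS desc1,
       t2.description AS desc2
       -- ... one more column per dictionary table, 10 in total
FROM data_table d
JOIN dict1 t1 ON t1.code = d.code1
JOIN dict2 t2 ON t2.code = d.code2
-- JOIN dict3 t3 ON t3.code = d.code3, and so on up to dict10
WHERE d.some_filter = ?
```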

Which solution performs better? I think this scenario is rather common.
I'm a little afraid of doing a multiple join on a table that has 5 million
records.

Tell me your opinion.
Pino
Nov 12 '05 #1
Pino,

you can do the same within DB2.
Keep the tables in a dedicated bufferpool.
Let DB2 figure out whether it wants to use hash joins or something else.

The rule of thumb is: If it's relational let the DBMS deal with it.
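
For example, something along these lines (the names, sizes and path are
illustrative only, adjust them for your system):

```sql
-- Pin the small dictionary tables in their own bufferpool so they
-- stay memory-resident, the in-database equivalent of your hashmaps.
CREATE BUFFERPOOL dict_bp IMMEDIATE SIZE 2000 PAGESIZE 4K;
CREATE TABLESPACE dict_ts PAGESIZE 4K
  MANAGED BY SYSTEM USING ('/db2/dict_ts')
  BUFFERPOOL dict_bp;
-- Create (or move) the dictionary tables in that tablespace:
CREATE TABLE dict1 (code INTEGER NOT NULL PRIMARY KEY,
                    description VARCHAR(100)) IN dict_ts;
```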

Cheers
Serge

--
Serge Rielau
DB2 SQL Compiler Development
IBM Toronto Lab
Nov 12 '05 #2
I have a different experience, with W2K, v7, FP10a.
Some time ago I wrote a select statement over 6-7 tables with 200,000 -
300,000 rows and 9-10 dictionary tables with a small number of rows. This query
always returned 1 row (it was contract information) and took about 1
minute; most of this time was spent on compilation (we used query
optimization level 3). I had to write an SQL stored procedure where I split
this query into 2 parts: first I populate a global temporary table with a
select over only the 6-7 large tables, and second I join this temporary table
with the dictionary tables. This SQL SP executed in about 5 seconds...
Conclusion: the DB2 optimizer is a great thing in general, but sometimes...
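
In outline, the split looked like this (all table and column names here are
made up, the real query was much wider):

```sql
-- Step 0: a session temporary table for the intermediate result.
DECLARE GLOBAL TEMPORARY TABLE session.contract_tmp
  (contract_id INTEGER, status_code INTEGER,
   region_code INTEGER, amount DECIMAL(15,2))
  ON COMMIT PRESERVE ROWS NOT LOGGED;

-- Step 1: join only the large tables.
INSERT INTO session.contract_tmp
  SELECT c.contract_id, c.status_code, c.region_code, p.amount
  FROM contracts c
  JOIN payments p ON p.contract_id = c.contract_id
  WHERE c.contract_no = ?;

-- Step 2: join the small intermediate result with the dictionaries.
SELECT t.contract_id, s.description, r.description, t.amount
FROM session.contract_tmp t
JOIN dict_status s ON s.code = t.status_code
JOIN dict_region r ON r.code = t.region_code;
```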

Best regards,
Mark.
> Keep the tables in a dedicated bufferpool.
> Let DB2 figure out whether it wants to use hashjoins or something else.
>
> The rule of thumb is: If it's relational let the DBMS deal with it.


Nov 12 '05 #3
Mark,

Why did you use dynamic SQL?
If you could use a stored proc for the piecemeal version, you could equally
have written a proc with the whole join in it, if you don't want to deal with
packages and the query doesn't execute often enough to stay in the cache.

Cheers
Serge
--
Serge Rielau
DB2 SQL Compiler Development
IBM Toronto Lab
Nov 12 '05 #4
> Why did you use dynamic SQL?

Sure; first of all I wrote this SP with a static select, but it was
considerably slower (as I remember, ~30 sec versus 3-5 sec) than the
split dynamic version... (I didn't understand why, and decided to use
dynamic SQL.)
Nov 12 '05 #5
You may want to look at the DB2_REDUCED_OPTIMIZATION registry
variable. Set it to an integer value, and DB2 drops the optimization
level for dynamic queries joining more tables than that integer. Very
useful.
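
It is set with db2set and, like most registry variables, takes effect after
an instance restart (the value 5 here is only an example threshold):

```shell
db2set DB2_REDUCED_OPTIMIZATION=5
db2stop
db2start
```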

/T

"Mark Barinstein" <ma**@crk.vsi.ru> wrote in message news:<ca***********@serv3.vsi.ru>...
> I have another experience with w2k, v7, fp10a.
> Some times ago I wrote a select statement on 6-7 tables with 200 000 - 300
> 000 rows and 9-10 dictionary tables with a few number of rows. This query
> always returned 1 row (it was a contract information) and lasted about 1
> minute. Most of this time was taken for a compilation (we used query
> optimization 3). And I had to write SQL stored procedure where I splited
> this query on 2 parts: first I populate global temporary table with select
> only over 6-7 large tables and second I join this temporary table with
> dictionary tables. This SQL SP executed about 5 seconds...
> Conclusion: DB2 optimizer - great thing in common, but sometimes...

Nov 12 '05 #6
I would use 1. I strongly suspect that 2 would be slower. It definitely
takes longer to code, and it fails to handle concurrent inserts and
updates to the dictionary tables (your hashmaps can become out-of-date), so
it would also be the source of various problems.

If join performance were to become a problem - which is not likely in your
simple join scenario - then you can resolve it by using a materialized
query table (see CREATE TABLE in the SQL Reference).
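
A minimal sketch of such a table, with illustrative names and only one
dictionary shown:

```sql
-- A materialized query table that precomputes the join once;
-- queries can then read it instead of re-joining every time.
CREATE TABLE data_with_desc AS
  (SELECT d.id, d.code1, t1.description AS desc1
   FROM data_table d
   JOIN dict1 t1 ON t1.code = d.code1)
  DATA INITIALLY DEFERRED REFRESH DEFERRED;
REFRESH TABLE data_with_desc;
```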

"Pino" <no****@novirus.invalid> wrote in message
news:gn********************@twister2.libero.it...
> Hi everybody,
>
> let's consider this scenario: you have 1 data-table and 10
> dictionary-tables; the data-table has 5 million records and 30 columns, 10
> of these columns have a foreign-key to the dictionary-tables: the
> dictionary-tables have (almost all) only two columns (code and description)
> and a low number of records (less than 100 in most cases, but one table has
> 1000 records and another one has 8000 records). When you perform a query on
> the data-table you must show the descriptions taken from the
> dictionary-tables; you have two options to do this:
>
> 1) A multiple join (11 tables) to get both main data and descriptions.
> 2) Load permanently the dictionary-tables in memory (using hashmaps) and
> query only the data-table, then the application looks up the descriptions
> from the hashmaps. Consider that you are writing a web application, so the
> hashmaps can be held in application scope to be used by all the users (more
> than 1000 users).
>
> Which solution performs better? I think this scenario is rather common.
> I'm a little afraid of doing a multiple join on a table that has 5 million
> records.
>
> Tell me your opinion.
> Pino

Nov 12 '05 #7
Tomas,

I don't like optimization levels less than 3. They often lead to poor query
plans for queries with 5-6 fairly large tables... Besides, our system is both
OLTP and DSS, so I think we can't use this variable...

Best regards,
Mark.

"Tomas Hallin" <ja*****@hotmail.com> wrote in message
news:2c************************@posting.google.com...
> You may want to look at the DB2_REDUCED_OPTIMIZATION registry
> variable. Set it to an integer value, and DB2 drops the optimization
> level down for dynamic queries joining more tables than specified in
> that integer. Very useful.


Nov 12 '05 #8

This thread has been closed and replies have been disabled. Please start a new discussion.
