about multiple fact tables and star schema

jane

HI,

I had a special senario here..

I need to build a database which only around 50-70GB finally.

But one of table would be 40GB finally. ( grow every year until to the
retension period)

rest of the tables are very small . (MB size level)

I'm thinking put the database on one partition, but had some query
performance concern on
that big table.

I'm going to break down the table by the time. But had one concern. It
is data warehouse environoment, Currently, this big table is a fact
table. it is a star-schema model. If I build a view based on the break
down tables, is that any impact on the optimizer to use star schema
model to choose access plan?

Aug 21 '06 #1

Subscribe Post Reply

4838

Serge Rielau

Jane,

Which version of DB2, which platform?
In DB2 9 you can easily use range partitioning.
In both DB2 V8 and DB2 multi dimensional clustering may apply.

UNION ALL view would be my last choice.
Even 4KB Pages give you up to 64GB on DB2 V8.
How much headroom do you need?

Cheers
Serge
--
Serge Rielau
DB2 Solutions Development
IBM Toronto Lab

IOD Conference
http://www.ibm.com/software/data/ond...ness/conf2006/

Aug 21 '06 #2

jane

Serge Rielau wrote:

Jane,

Which version of DB2, which platform?
In DB2 9 you can easily use range partitioning.
In both DB2 V8 and DB2 multi dimensional clustering may apply.

UNION ALL view would be my last choice.
Even 4KB Pages give you up to 64GB on DB2 V8.
How much headroom do you need?

Cheers
Serge
--
Serge Rielau
DB2 Solutions Development
IBM Toronto Lab

IOD Conference
http://www.ibm.com/software/data/ond...ness/conf2006/

Thanks, Serge.

Probably I did not explain my question clearly.

My environment is UDB V8 on AIX.

My concern is not table size limit. Because now I know the Max size
for the big table is 40GB. It is OK on single partition, even for 4K
page.

My concern is on query performance. I know if using multi-partition, it
must be resolved.
but the other tables are all very small, and the total DB size is not
big (50GB), based on the DB size, I want to use single partition
database. Only because of this big table to use multi-partiton , seems
a little bit waste..

That's why I want to break down the big table to relatively smaller
ones. Use union all view.

One of my colleague remind me this would change optimizer to not use
star schema to get good access plan..
(because currently , the big table is fact table, it is in star schema
model)
I'm not sure if this is the case..

Aug 21 '06 #3

Serge Rielau

Make your fact table a multi-dimensional-cluster.
That's ll do. I agree that DPF would be overkill.
If you don't have a low cardinality column that offers itself up for
MDC, just add another column e.g.
yearmonth GENERATED ALWAYS AS (INTEGER(date)/100)
Then cluster on the yearmonth column.
If you have 3 years worth of data this would partition the table into
3 * 12 =36 slices.

Cheers
Serge

--
Serge Rielau
DB2 Solutions Development
IBM Toronto Lab

IOD Conference
http://www.ibm.com/software/data/ond...ness/conf2006/

Aug 21 '06 #4

jane

Serge Rielau wrote:

Make your fact table a multi-dimensional-cluster.
That's ll do. I agree that DPF would be overkill.
If you don't have a low cardinality column that offers itself up for
MDC, just add another column e.g.
yearmonth GENERATED ALWAYS AS (INTEGER(date)/100)
Then cluster on the yearmonth column.
If you have 3 years worth of data this would partition the table into
3 * 12 =36 slices.

Cheers
Serge

--
Serge Rielau
DB2 Solutions Development
IBM Toronto Lab

IOD Conference
http://www.ibm.com/software/data/ond...ness/conf2006/

Serge

Thanks a lot!

I did not realize I could use MDC...

Aug 22 '06 #5

sethwai

Serge,

Your response brings up a question. How will the advantages of MDC
tables be leveraged when the clustering column is not related to the
actual data relationships and therefore will not be referenced in any
predicates?

Thanks.

Lew

Serge Rielau wrote:

Make your fact table a multi-dimensional-cluster.
That's ll do. I agree that DPF would be overkill.
If you don't have a low cardinality column that offers itself up for
MDC, just add another column e.g.
yearmonth GENERATED ALWAYS AS (INTEGER(date)/100)
Then cluster on the yearmonth column.
If you have 3 years worth of data this would partition the table into
3 * 12 =36 slices.

Cheers
Serge

--
Serge Rielau
DB2 Solutions Development
IBM Toronto Lab

IOD Conference
http://www.ibm.com/software/data/ond...ness/conf2006/

Aug 22 '06 #6

Serge Rielau

se*****@yahoo.com wrote:

Your response brings up a question. How will the advantages of MDC
tables be leveraged when the clustering column is not related to the
actual data relationships and therefore will not be referenced in any
predicates?

It won't. That's the same for UNION ALL, or range clustering.
If there is no way for the compiler or the run time engine to eliminate
ranges then its' useless.
If you are looking at parallelism note that SMP doesn't operate on a per
range level. Instead it will start reading all the ranges in all the
threads and sub divide within the range.
This is different from DPF where each node will read it's database
partition exclusively.

Cheers
Serge
--
Serge Rielau
DB2 Solutions Development
IBM Toronto Lab

IOD Conference
http://www.ibm.com/software/data/ond...ness/conf2006/

Aug 22 '06 #7

by: Justin | last post by:

Hi, In the process of localizing the 'regions' table, we added three new tables. The localized data will be stored in the TokenKeys and TokenValues tables. It would be easier if we did away with...

Microsoft SQL Server

XML conforming to multiple Schemas simultaneously

by: Alex Rootham | last post by:

Hi, I'm quite new to schema design, and was wondering if someone might point out what I'm doing wrong when designing my Schema's I have a need to do the following: Schema1.xsd -- defines...

.NET Framework

schema extraction for multiple interdependent databases

by: RugbyCoach | last post by:

I'm looking for a tool that can extract only the schema from a database in a form that can be used to generate that schema in another empty database. This is to facilitate our disaster recovery...

Microsoft SQL Server

row level security across multiple columns

by: gregory.sharrow | last post by:

I need to secure a datawarehouse table at the row level based on 1 to many keys on that table. A user should only see the rows they have access to. I need to be able to figure out which rows they...

Microsoft SQL Server

Questions about DB2

by: Firehawk® | last post by:

Hi, I have a great experience with DB (Oracle and others) and I am using DB2 now. Then, I have few questions: 1. How drop a schema (user) and all objects of this schema, like a Oracle "drop...

DB2 Database

about cursor load

by: jane | last post by:

I'm going to use cursor load to load 200GB data in my production database. My database has 2 partitions. but I cannot find more info in the manual about this cursor load. I'm concern about...

DB2 Database

Best way to model hybrid single/multiple logical database

by: Steve Jorgensen | last post by:

I've run across this issue several times of late, and I've never come up with a satisfactory answer to the best way to handle this schema issue. You have a large section of schema in which a...

Microsoft Access / VBA

Save FormSettings of MULTIPLE Forms in XML file!? (VB.NET)

by: DraguVaso | last post by:

Hi, I found some examples for storing the FormSettings of a Form in an XML-file, but none of these could match my criteria: What I am looking for is the possibility to save the FormSettings of...

Visual Basic .NET

Thoughts about SMS tablespaces

by: alexhguerra | last post by:

Hello If im not missing something, when looking into directories that are SMS containers i can see multiple files, one for table data, one for indexes and other for lobs. Is there any special...

DB2 Database

Easy Steps to Fix "Canon Printer Won't Connect to WiFi Network"

by: taylorcarr | last post by:

A Canon printer is a smart device known for being advanced, efficient, and reliable. It is designed for home, office, and hybrid workspace use and can also be used for a variety of purposes. However,...

General

Migrating Website to Cloud - Emmanuel Katto

by: emmanuelkatto | last post by:

Hi All, I am Emmanuel katto from Uganda. I want to ask what challenges you've faced while migrating a website to cloud. Please let me know. Thanks! Emmanuel

General

Navigating the Data Structures and Algorithms (DSA)

by: BarryA | last post by:

What are the essential steps and strategies outlined in the Data Structures and Algorithms (DSA) roadmap for aspiring data scientists? How can individuals effectively utilize this roadmap to progress...

Algorithms / Advanced Math

Looking to do Android software development, any suggestions? Is flutter better?

by: nemocccc | last post by:

hello, everyone, I want to develop a software for my android phone for daily needs, any suggestions?

General

How to build RAID in BIOS?

by: Hystou | last post by:

There are some requirements for setting up RAID: 1. The motherboard and BIOS support RAID configuration. 2. The motherboard has 2 or more available SATA protocol SSD/HDD slots (including MSATA, M.2...

Computer Hardware

What is ONU?

by: marktang | last post by:

ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However,...

General

Changing the language in Windows 10

by: Hystou | last post by:

Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can...

Windows Server

Problem With Comparison Operator <=> in G++

by: Oralloy | last post by:

Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers,...

C / C++

Maximizing Business Potential: The Nexus of Website Design and Digital Marketing

by: jinu1996 | last post by:

In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven...

Online Marketing

about multiple fact tables and star schema

Similar topics