473,325 Members | 2,774 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,325 software developers and data experts.

On the subject of Data Warehouses, Data Cubes & OLAP….

On the subject of Data Warehouses, Data Cubes & OLAP….

I would like to speak frankly about Data Warehouses, Data Cubes and
OLAP (on-line analytical processing). Has it dawned on anyone else
that these buzz words were created by some geek who decided to take a
stab at marketing? Knowing that to the backwoods manager who knows
little of technology that new innovative names for old concepts would
help to sale their products.

I mean seriously, what is the story here? In a nut shell, and please
stop me if you disagree, but isn’t a data warehouse simply a
database? Can’t you do everything on a conventional database
like SQL Server, Oracle or DB2 that you can do on these new
proprietary Data Warehouse constructs? I mean who are they trying to
fool?

Take a look, for instance, at Data Cubes. Who hasn’t noticed
the striking similarity between data cubes and views used in all the
more robust databases? Also, what about OLAP? OLAP is nothing more
than a report generator. There’s nothing you can do with these
million dollar price tagged Data Warehouse total solution packages
that I can’t do with SQL Server, Oracle or DB2…for that
matter Microsoft Access.

As an example some sales people for Metadata Corporation has the Vice
President of I.T. in Nashville, for Healthspring, sold on their total
solution data respository which is such a scam. All they had to do
was throw a couple of buzzwords at him and they have him hypnotized.

Personally, I feel that these kinds of marketing practices undermine
our industry. It helps to unravel what little standards or
consistency we have. What do you guys think?

Stuart
Nov 12 '05 #1
8 1455
Hi,
it is true one could do most of the data warehousing/minning in a
regular relational database, indeed some of the products that do data
analysis are utilizing relational DB backends. However there is a lot
more to OLAP then is in relational DB.

First there is the volume of data this is why you need the cubes.
Consider you need to create reports on the performance of an
enterprise - the data amounts ~150 Mb per hour. So you periodically
sum it in different dimensions georgraphical, time etc. (this is done
in a persistent manner, unlike views in relational DB's, so you have
this data available immediately not after an hour of disk scratching).

Next the reports true these are simple things at first sight, but the
maths (statistics, complex polinomials) usually embedded in OLAP
products are impressive.

So true OLAP looks much like traditional DB, but doing OLAP with
regular DB is a lot of head banging. Without the tools - If you get
the maths right probably most of the things will be so slow it will
take a lifetime to see results or the other way around.
Nov 12 '05 #2
"Will" <st**********@synovusmortgage.com> wrote in message

Personally, I feel that these kinds of marketing practices undermine
our industry. It helps to unravel what little standards or
consistency we have. What do you guys think?


We can certainly jump out and say that "data warehousing" is some cool
marketing hype. However, one could also say the same thing about using a
relational database. Fact is, many of the concepts of relational database
and the concepts of data normalizing are extremely important.

The same goes for data warehousing. For example, you might want to do some
tracking on some products sold. However, you can't always put millions and
millions of records in to a huge database and then simply start running
quires on that huge dataset. It is going to be too large to work with, and
too many records need to be processed to get your information in a quick
manor. So, for example to track product sales, you might SUMMARIZE the data
into a data cube. Your 3rd dimension might only contain the last 2, or 3
months worth of transactions (sales over time). Thus, there is LOT of work
and decision making as to how much detail data you will put into the data
warehouse. Further, often much of the data you put into the data warehouse
is SUMMARY data (ie: totals of sales of product over given amount of time).
A typical complex system will often easily have 40, or 50 tables of related
data. When you move the results to a data warehouse, you are de-nominating
the data into summary tables. You may not even store which customer bought
which product, but only stuff like sales per day, and perhaps by region.
Often those resulting tables will not need a whole bunch of joined sql data
tables to get your results (they will HAVE BEEN ALREADY summarised and
totalled BEFORE you start asking questions about the data).

Only the analyst who has a good deal of experience on the kind of questions
(queries) to asked will really have a good idea as to how much data will go
into the warehouse. As mentioned, you often can't even begin to put every
single transaction into that data warehouse. A lot of planning will have to
go into how much drill down of details you want to keep.

So, while there might be buzz words like a "data warehouse", it is certainly
not just marketing hype but is result that we can't always query the live
production database fast enough to get the kinds of answers we need on a
daily bases (ie: just how much should you spend on radio and tv advertising
this week?).
--
Albert D. Kallal
Edmonton, Alberta Canada
ka****@msn.com
http://www.attcanada.net/~kallal.msn

Nov 12 '05 #3
> So true OLAP looks much like traditional DB, but doing OLAP with
regular DB is a lot of head banging. Without the tools - If you get
the maths right probably most of the things will be so slow it will
take a lifetime to see results or the other way around.


It is my experience OLAP is nothing more than a report generator using
views. Anyone wants to argue with me? I've done everything you have
mentioned with a great deal of success using SQL Server and Crystal
Reports. Having created all the input to the reports using views and
unions I've made some damn impressive reports. The views function
identical to cubes. You might argue that this consumes resources
because they are generated on the fly and could hamper traffic, but
I've never experienced this. The perfect DW model in my humble
opinion is snowflake/normalized on a SQL Server (robust) box.
Attention to detail and elegant views yield a most powerful EIS.
Arguments?

The statistical calculations are easy to put together. More
importantly...IT'S THE MOST FUN PART!!! CAN I GET AN AMEN?

While I'm on my soapbox has anyone noticed the conceptual similarities
of star/snowflake schema models and that of non-realational/relational
or unmormalized/normalized schema models respectively??? GIVE ME A
BREAK! Why does anyone CHOOSE to create new names for old concepts?
I think they should be drug out into the street and shot! I CAN'T
HEAR YOU CONGREGATION!

Stu
Nov 12 '05 #4
Bob
"Will" <st**********@synovusmortgage.com> wrote in message
news:4e**************************@posting.google.c om...
<snip>
Why does anyone CHOOSE to create new names for old concepts?
I think they should be drug out into the street and shot!


drug?
Is this a new name for the old concept 'dragged'?

Bob
Nov 12 '05 #5
An arguement, in general, on Stuart's side. Here we keep talking
about millions and millions of records. I work with a small
manufacturing enterprise with annual revenue of about 50 million, they
actually use Access, data retrieval is slowish, imho, not because of
the volumne of data (about 65 MB), but because of unplanned db design.
After exporting the Access db to SQL Server, and normalized a bit,
man, data retrieval is almost light speed. They continue to use
existing Access programs like FORMS etc. while getting whatever data
they want from the SQL Server quickly and easily with a tool I
developed. Down the road they may dump the Access db, but that would
be later.
My point is, for large enterprises like Fortune 1000 something, the
conventional DW etc. may be of value to them if the implementation is
successful (I heard of a high rate of DW implementation failure)
however that may not be applicable to SMEs (small to medium sized
enterprises).
DL
http://www.hegelsoftware.com
"Albert D. Kallal" <NO************@msn.com> wrote in message news:<gW1jb.98779$9l5.63320@pd7tw2no>...
"Will" <st**********@synovusmortgage.com> wrote in message

Personally, I feel that these kinds of marketing practices undermine
our industry. It helps to unravel what little standards or
consistency we have. What do you guys think?


We can certainly jump out and say that "data warehousing" is some cool
marketing hype. However, one could also say the same thing about using a
relational database. Fact is, many of the concepts of relational database
and the concepts of data normalizing are extremely important.

The same goes for data warehousing. For example, you might want to do some
tracking on some products sold. However, you can't always put millions and
millions of records in to a huge database and then simply start running
quires on that huge dataset. It is going to be too large to work with, and
too many records need to be processed to get your information in a quick
manor. So, for example to track product sales, you might SUMMARIZE the data
into a data cube. Your 3rd dimension might only contain the last 2, or 3
months worth of transactions (sales over time). Thus, there is LOT of work
and decision making as to how much detail data you will put into the data
warehouse. Further, often much of the data you put into the data warehouse
is SUMMARY data (ie: totals of sales of product over given amount of time).
A typical complex system will often easily have 40, or 50 tables of related
data. When you move the results to a data warehouse, you are de-nominating
the data into summary tables. You may not even store which customer bought
which product, but only stuff like sales per day, and perhaps by region.
Often those resulting tables will not need a whole bunch of joined sql data
tables to get your results (they will HAVE BEEN ALREADY summarised and
totalled BEFORE you start asking questions about the data).

Only the analyst who has a good deal of experience on the kind of questions
(queries) to asked will really have a good idea as to how much data will go
into the warehouse. As mentioned, you often can't even begin to put every
single transaction into that data warehouse. A lot of planning will have to
go into how much drill down of details you want to keep.

So, while there might be buzz words like a "data warehouse", it is certainly
not just marketing hype but is result that we can't always query the live
production database fast enough to get the kinds of answers we need on a
daily bases (ie: just how much should you spend on radio and tv advertising
this week?).

Nov 12 '05 #6
I would argue that 65mb on Access shouldn't be overwhelming. Though
Access on a single processor PC is never going to perform. SQL Server
on a "built" box compared to Access is like comparing a Fierrari to a
Ford Focus.

SQL Server is truly ideal. Sad it's a Microsoft product. Let me tell
you that when you can tie in security to the DB through your local
network security software ("NT") then why would you go with anyother
product? If Oracle came out with their own proprietary network that
would meld with the security of their db or vise versa then Oracle
would be the ONLY solution! Oracle, can you hear me now?

Stu
qw********@boxfrog.com (Doug Baroter) wrote in message news:<fc**************************@posting.google. com>...
An arguement, in general, on Stuart's side. Here we keep talking
about millions and millions of records. I work with a small
manufacturing enterprise with annual revenue of about 50 million, they
actually use Access, data retrieval is slowish, imho, not because of
the volumne of data (about 65 MB), but because of unplanned db design.
After exporting the Access db to SQL Server, and normalized a bit,
man, data retrieval is almost light speed. They continue to use
existing Access programs like FORMS etc. while getting whatever data
they want from the SQL Server quickly and easily with a tool I
developed. Down the road they may dump the Access db, but that would
be later.
My point is, for large enterprises like Fortune 1000 something, the
conventional DW etc. may be of value to them if the implementation is
successful (I heard of a high rate of DW implementation failure)
however that may not be applicable to SMEs (small to medium sized
enterprises).
DL
http://www.hegelsoftware.com
"Albert D. Kallal" <NO************@msn.com> wrote in message news:<gW1jb.98779$9l5.63320@pd7tw2no>...
"Will" <st**********@synovusmortgage.com> wrote in message

Personally, I feel that these kinds of marketing practices undermine
our industry. It helps to unravel what little standards or
consistency we have. What do you guys think?


We can certainly jump out and say that "data warehousing" is some cool
marketing hype. However, one could also say the same thing about using a
relational database. Fact is, many of the concepts of relational database
and the concepts of data normalizing are extremely important.

The same goes for data warehousing. For example, you might want to do some
tracking on some products sold. However, you can't always put millions and
millions of records in to a huge database and then simply start running
quires on that huge dataset. It is going to be too large to work with, and
too many records need to be processed to get your information in a quick
manor. So, for example to track product sales, you might SUMMARIZE the data
into a data cube. Your 3rd dimension might only contain the last 2, or 3
months worth of transactions (sales over time). Thus, there is LOT of work
and decision making as to how much detail data you will put into the data
warehouse. Further, often much of the data you put into the data warehouse
is SUMMARY data (ie: totals of sales of product over given amount of time).
A typical complex system will often easily have 40, or 50 tables of related
data. When you move the results to a data warehouse, you are de-nominating
the data into summary tables. You may not even store which customer bought
which product, but only stuff like sales per day, and perhaps by region.
Often those resulting tables will not need a whole bunch of joined sql data
tables to get your results (they will HAVE BEEN ALREADY summarised and
totalled BEFORE you start asking questions about the data).

Only the analyst who has a good deal of experience on the kind of questions
(queries) to asked will really have a good idea as to how much data will go
into the warehouse. As mentioned, you often can't even begin to put every
single transaction into that data warehouse. A lot of planning will have to
go into how much drill down of details you want to keep.

So, while there might be buzz words like a "data warehouse", it is certainly
not just marketing hype but is result that we can't always query the live
production database fast enough to get the kinds of answers we need on a
daily bases (ie: just how much should you spend on radio and tv advertising
this week?).

Nov 12 '05 #7
I would argue that 65mb on Access shouldn't be overwhelming. Though
Access on a single processor PC is never going to perform. SQL Server
on a "built" box compared to Access is like comparing a Fierrari to a
Ford Focus.

SQL Server is truly ideal. Sad it's a Microsoft product. Let me tell
you that when you can tie in security to the DB through your local
network security software ("NT") then why would you go with anyother
product? If Oracle came out with their own proprietary network that
would meld with the security of their db or vise versa then Oracle
would be the ONLY solution! Oracle, can you hear me now?

Stu
qw********@boxfrog.com (Doug Baroter) wrote in message news:<fc**************************@posting.google. com>...
An arguement, in general, on Stuart's side. Here we keep talking
about millions and millions of records. I work with a small
manufacturing enterprise with annual revenue of about 50 million, they
actually use Access, data retrieval is slowish, imho, not because of
the volumne of data (about 65 MB), but because of unplanned db design.
After exporting the Access db to SQL Server, and normalized a bit,
man, data retrieval is almost light speed. They continue to use
existing Access programs like FORMS etc. while getting whatever data
they want from the SQL Server quickly and easily with a tool I
developed. Down the road they may dump the Access db, but that would
be later.
My point is, for large enterprises like Fortune 1000 something, the
conventional DW etc. may be of value to them if the implementation is
successful (I heard of a high rate of DW implementation failure)
however that may not be applicable to SMEs (small to medium sized
enterprises).
DL
http://www.hegelsoftware.com
"Albert D. Kallal" <NO************@msn.com> wrote in message news:<gW1jb.98779$9l5.63320@pd7tw2no>...
"Will" <st**********@synovusmortgage.com> wrote in message

Personally, I feel that these kinds of marketing practices undermine
our industry. It helps to unravel what little standards or
consistency we have. What do you guys think?


We can certainly jump out and say that "data warehousing" is some cool
marketing hype. However, one could also say the same thing about using a
relational database. Fact is, many of the concepts of relational database
and the concepts of data normalizing are extremely important.

The same goes for data warehousing. For example, you might want to do some
tracking on some products sold. However, you can't always put millions and
millions of records in to a huge database and then simply start running
quires on that huge dataset. It is going to be too large to work with, and
too many records need to be processed to get your information in a quick
manor. So, for example to track product sales, you might SUMMARIZE the data
into a data cube. Your 3rd dimension might only contain the last 2, or 3
months worth of transactions (sales over time). Thus, there is LOT of work
and decision making as to how much detail data you will put into the data
warehouse. Further, often much of the data you put into the data warehouse
is SUMMARY data (ie: totals of sales of product over given amount of time).
A typical complex system will often easily have 40, or 50 tables of related
data. When you move the results to a data warehouse, you are de-nominating
the data into summary tables. You may not even store which customer bought
which product, but only stuff like sales per day, and perhaps by region.
Often those resulting tables will not need a whole bunch of joined sql data
tables to get your results (they will HAVE BEEN ALREADY summarised and
totalled BEFORE you start asking questions about the data).

Only the analyst who has a good deal of experience on the kind of questions
(queries) to asked will really have a good idea as to how much data will go
into the warehouse. As mentioned, you often can't even begin to put every
single transaction into that data warehouse. A lot of planning will have to
go into how much drill down of details you want to keep.

So, while there might be buzz words like a "data warehouse", it is certainly
not just marketing hype but is result that we can't always query the live
production database fast enough to get the kinds of answers we need on a
daily bases (ie: just how much should you spend on radio and tv advertising
this week?).

Nov 12 '05 #8
st**********@synovusmortgage.com (Will) wrote in message news:<4e**************************@posting.google. com>...
I would argue that 65mb on Access shouldn't be overwhelming. The thing is, as I mentioned in last posting, the db was not well
designed, it has over 100 user tables, some not even being used at all
but they are not too clear as to which ones EXACTLY, and some are
linked table to external db, anyway, it's sort of snow-balling.
Though
Access on a single processor PC is never going to perform. SQL Server
on a "built" box compared to Access is like comparing a Fierrari to a
Ford Focus.

SQL Server is truly ideal. I like it too and I believe it is a good fit for lots of SMEs.
Sad it's a Microsoft product. I don't necessarily dislike MS.
Let me tell
you that when you can tie in security to the DB through your local
network security software ("NT") then why would you go with anyother
product? Yes, MS SQL Server has two security mechanisms. Though I'm not a
heavy Oracle user, my impression is Oralce is stronger than MS SQL
Server in this dept.
If Oracle came out with their own proprietary network that
would meld with the security of their db or vise versa then Oracle
would be the ONLY solution! Oracle, can you hear me now?

Stu
qw********@boxfrog.com (Doug Baroter) wrote in message news:<fc**************************@posting.google. com>...
An arguement, in general, on Stuart's side. Here we keep talking
about millions and millions of records. I work with a small
manufacturing enterprise with annual revenue of about 50 million, they
actually use Access, data retrieval is slowish, imho, not because of
the volumne of data (about 65 MB), but because of unplanned db design.
After exporting the Access db to SQL Server, and normalized a bit,
man, data retrieval is almost light speed. They continue to use
existing Access programs like FORMS etc. while getting whatever data
they want from the SQL Server quickly and easily with a tool I
developed. Down the road they may dump the Access db, but that would
be later.
My point is, for large enterprises like Fortune 1000 something, the
conventional DW etc. may be of value to them if the implementation is
successful (I heard of a high rate of DW implementation failure)
however that may not be applicable to SMEs (small to medium sized
enterprises).
DL
http://www.hegelsoftware.com
"Albert D. Kallal" <NO************@msn.com> wrote in message news:<gW1jb.98779$9l5.63320@pd7tw2no>...
"Will" <st**********@synovusmortgage.com> wrote in message

>
> Personally, I feel that these kinds of marketing practices undermine
> our industry. It helps to unravel what little standards or
> consistency we have. What do you guys think?
>

We can certainly jump out and say that "data warehousing" is some cool
marketing hype. However, one could also say the same thing about using a
relational database. Fact is, many of the concepts of relational database
and the concepts of data normalizing are extremely important.

The same goes for data warehousing. For example, you might want to do some
tracking on some products sold. However, you can't always put millions and
millions of records in to a huge database and then simply start running
quires on that huge dataset. It is going to be too large to work with, and
too many records need to be processed to get your information in a quick
manor. So, for example to track product sales, you might SUMMARIZE the data
into a data cube. Your 3rd dimension might only contain the last 2, or 3
months worth of transactions (sales over time). Thus, there is LOT of work
and decision making as to how much detail data you will put into the data
warehouse. Further, often much of the data you put into the data warehouse
is SUMMARY data (ie: totals of sales of product over given amount of time).
A typical complex system will often easily have 40, or 50 tables of related
data. When you move the results to a data warehouse, you are de-nominating
the data into summary tables. You may not even store which customer bought
which product, but only stuff like sales per day, and perhaps by region.
Often those resulting tables will not need a whole bunch of joined sql data
tables to get your results (they will HAVE BEEN ALREADY summarised and
totalled BEFORE you start asking questions about the data).

Only the analyst who has a good deal of experience on the kind of questions
(queries) to asked will really have a good idea as to how much data will go
into the warehouse. As mentioned, you often can't even begin to put every
single transaction into that data warehouse. A lot of planning will have to
go into how much drill down of details you want to keep.

So, while there might be buzz words like a "data warehouse", it is certainly
not just marketing hype but is result that we can't always query the live
production database fast enough to get the kinds of answers we need on a
daily bases (ie: just how much should you spend on radio and tv advertising
this week?).

Nov 12 '05 #9

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

7
by: Will | last post by:
On the subject of Data Warehouses, Data Cubes & OLAP…. I would like to speak frankly about Data Warehouses, Data Cubes and OLAP (on-line analytical processing). Has it dawned on anyone else...
7
by: Will | last post by:
On the subject of Data Warehouses, Data Cubes & OLAP…. I would like to speak frankly about Data Warehouses, Data Cubes and OLAP (on-line analytical processing). Has it dawned on anyone else...
3
by: Jim Higson | last post by:
Does anyone know a technique in javascript to transform from (for example) &hearts; to the char '♥'? I'm doing this because I have to interpret some data I got over XHTMLHTTP that isn't XML,...
2
by: Dman | last post by:
Having trouble in Access XP. I want to display the customers name, city and phone number in a Combo Box but the but the formatting is lost – eg (123) 456-7890 is displayed as 1234567890. Any...
3
by: John Nagle | last post by:
I have XML replies in a DOM which contain entity escapes, like "&amp;". What's the proper way to replace them with the ordinary characters? Preferably something that will work in most browsers? I...
3
by: yuanyun.ken | last post by:
hi,dear all js gurus. In my app, server responses some text like: '&nbsp;&nbsp; ' and I need display these content in textarea. But Html would convert specail characters to space, and ignore line...
0
gsgurdeep
by: gsgurdeep | last post by:
Today One Question is asked to me in interview for a software programmer post. Please tell me the answer of this question. Question is given bellow…………………….
17
by: malathib | last post by:
Hi, I have used a rich text editor in my application. I got it from one of the site.all are JS files. My problem is, i want to call a javascript function from the iframe. Can anybody help...
5
matheussousuke
by: matheussousuke | last post by:
Hello, I'm using tiny MCE plugin on my oscommerce and it is inserting my website URL when I use insert image function in the emails. The goal is: Make it send the email with the URL...
0
by: DolphinDB | last post by:
Tired of spending countless mintues downsampling your data? Look no further! In this article, you’ll learn how to efficiently downsample 6.48 billion high-frequency records to 61 million...
1
isladogs
by: isladogs | last post by:
The next Access Europe meeting will be on Wednesday 6 Mar 2024 starting at 18:00 UK time (6PM UTC) and finishing at about 19:15 (7.15PM). In this month's session, we are pleased to welcome back...
0
by: jfyes | last post by:
As a hardware engineer, after seeing that CEIWEI recently released a new tool for Modbus RTU Over TCP/UDP filtering and monitoring, I actively went to its official website to take a look. It turned...
1
by: PapaRatzi | last post by:
Hello, I am teaching myself MS Access forms design and Visual Basic. I've created a table to capture a list of Top 30 singles and forms to capture new entries. The final step is a form (unbound)...
1
by: CloudSolutions | last post by:
Introduction: For many beginners and individual users, requiring a credit card and email registration may pose a barrier when starting to use cloud servers. However, some cloud server providers now...
1
by: Shællîpôpï 09 | last post by:
If u are using a keypad phone, how do u turn on JavaScript, to access features like WhatsApp, Facebook, Instagram....
0
by: af34tf | last post by:
Hi Guys, I have a domain whose name is BytesLimited.com, and I want to sell it. Does anyone know about platforms that allow me to list my domain in auction for free. Thank you
0
by: Faith0G | last post by:
I am starting a new it consulting business and it's been a while since I setup a new website. Is wordpress still the best web based software for hosting a 5 page website? The webpages will be...
0
isladogs
by: isladogs | last post by:
The next Access Europe User Group meeting will be on Wednesday 3 Apr 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome former...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.