469,648 Members | 1,504 Online
Bytes | Developer Community
New Post

Home Posts Topics Members FAQ

Post your question to a community of 469,648 developers. It's quick & easy.

Very large number of rows

We are busy designing a generic analytical system at work that will
hold multiple analytic types over time. This system is being developed
in SQL 2000.

Example of table
IDENTITY int
ItemId int [PK]
AnalyticType int [PK]
AnalyticDate DateTime [PK]
Value numeric(28,15)

ItemId - the item for which the analytic is being stored
AnalyticType - an arbitrary type

The [PK] tag indicates the composite primary key.

Our scenario is the following:
* For this time series data, we expect around 250 days per year
(working days) and the dataset could extend to over 20 years
* Up to 50 analytic types
* Up to 20,000 items

Looking at the combined calculation - this comes to roughly something
like
25 * 20,000 * 50 * 250 or around 5 billion rows.

We will be inserting around 50*20,000 or around 1 million rows each day
(the inserts will take place in the middle of the night (outside the
main query time) - this could be done through something like BCP or
BULK INSERT.

Our real problem is we have not previously worked with such large
tables before and are nervous that our system is going to grind to a
halt. Our biggest tables are around 20 million rows at the moment.

Scanning through google and microsoft's own site we have found a
parititioning method that is available.
http://www.microsoft.com/resources/d...rt5/c1861.mspx
Having experimented with the above system it seems rather quirky and
looking at the available literature it seems that this is not more
effective than a clustered index as far as queries go.

It needs to be optimized for queries like:
Given the ItemID and the AnalyticType search for a specific date or a
specific range of dates.

If anyone has any experience or helpful suggestions I would really
appreciate it.

Thanks

A

Jul 23 '05 #1
4 3245
The design flaw is called EAV (Entity-Attribute-Value) and consists of
mixing data and metadata in an application. My experience with it is
that it falls part in about one year or production work. SQL was
never meant to do this. I would look for another tool, probably a stat
package with time series functions in it. Also consider Stonebreaker's
Bitstream database, Kx or CQL from Stanford for streaming data.

Jul 23 '05 #2
You call this a "generic analytical system" but it doesn't look like a
very meaningful data model at all to me. I assume this is intended to
be a Data Warehouse? If so, are your "analytic types" really different
kinds of measures? Are you intending to use Analysis Services?

I don't undestand the point of your design from the information you've
given. If you haven't already then you should study some of the
literature on data warehousing techniques (Kimball, Inmon, etc) as
mistakes at this level can be very costly.

--
David Portas
SQL Server MVP
--

Jul 23 '05 #3
Can you enlarge on the issues with EAV? (okay, I'm sure it's in one of your
books......).

"--CELKO--" <jc*******@earthlink.net> wrote in message
news:11*********************@g14g2000cwa.googlegro ups.com...
The design flaw is called EAV (Entity-Attribute-Value) and consists of
mixing data and metadata in an application. My experience with it is
that it falls part in about one year or production work. SQL was
never meant to do this. I would look for another tool, probably a stat
package with time series functions in it. Also consider Stonebreaker's
Bitstream database, Kx or CQL from Stanford for streaming data.

Jul 23 '05 #4
>> Can you enlarge on the issues with EAV? (okay, I'm sure it's in one
of your books... <<

For fun, take a small schema like Northwind or the usual
order-orderdetails-inventory-customer databases. Convert that schema
into an EAV design. How do make sure that every entity has all of its
attributes? Try to add CHECK() constraints and DEFAULT values. try to
do an insert, update or delete in one statement. Try to write a simple
query, like "How many people ordered peanut butter and jelly?" Make a
typo in a name and create a new attribute. Try to write a grabage
keeping routine to get rid of orphans.

Jul 23 '05 #5

This discussion thread is closed

Replies have been disabled for this discussion.

Similar topics

13 posts views Thread by StealthBananaT | last post: by
2 posts views Thread by shsandeep | last post: by
2 posts views Thread by =?Utf-8?B?UHJpeWE=?= | last post: by
1 post views Thread by datapro01 | last post: by
5 posts views Thread by Henry J. | last post: by
reply views Thread by gheharukoh7 | last post: by
By using this site, you agree to our Privacy Policy and Terms of Use.