By using this site, you agree to our updated Privacy Policy and our Terms of Use. Manage your Cookies Settings.
435,501 Members | 2,790 Online
Bytes IT Community
+ Ask a Question
Need help? Post your question and get tips & solutions from a community of 435,501 IT Pros & Developers. It's quick & easy.

Question about SQL table definitions

P: n/a
I think I know the answer to this, but suggestions welcomed.

I have to store lots of information. An electricity reading every half
hour for hundreds of meters for several years.

The ways I thought of storing this in SQL were:
1. One table row per day with 48 fields for the readings.
144M table, 3M index
saves space and having a long skinny table
however, getting the min/max reading in a day is a pain. I
think I'll have to store it which goes against the grain
2. One table row per reading.
376M table, 231M index
big time long skinny table with more control information than
content, huge waste of disk space. However it is first normal
form so queries are easier.
3. Two tables, one to hold control information and one for the readings.
added complications because of the join and creation of an
internal link field.

A quick trial proved 1 to be cumbersome but space saving. I'm currently
using option 2.

any suggestions or recommenndations?


Jul 17 '05 #1
Share this Question
Share on Google+
3 Replies


P: n/a
On Wed, 26 Nov 2003 14:48:58 +0000, Kevin Thorpe <ke***@pricetrak.com>
wrote:
I think I know the answer to this, but suggestions welcomed.

I have to store lots of information. An electricity reading every half
hour for hundreds of meters for several years.

The ways I thought of storing this in SQL were:
1. One table row per day with 48 fields for the readings.
144M table, 3M index
saves space and having a long skinny table
however, getting the min/max reading in a day is a pain. I
think I'll have to store it which goes against the grain
Ouch!
2. One table row per reading.
376M table, 231M index
big time long skinny table with more control information than
content, huge waste of disk space. However it is first normal
form so queries are easier.
3. Two tables, one to hold control information and one for the readings.
added complications because of the join and creation of an
internal link field.


It really ought to be option 3 (which is in 3NF), as you're
duplicating your control info with option 2. Indexing the date and
control id should help performance.

Depending on what kinds of reports you want to run, you may be able to
archive off old data with option 3 thereby keeping the database to a
manageable size. We do that with some of our customer's history
records whenever they notice the system becoming slow.

If most queries will concern, say, the previous week's readings you
could create a separate database for those and have a scheduled job
move the records into an archive database each day/week.

Are your controls grouped (say, geographically, by purpose, etc)? Will
you want to combine the readings of several controls?

--
David ( @priz.co.uk )
The Internet Prisoner Database: http://www.priz.co.uk/ipdb/
The Tarbrax Chronicle: http://www.tarbraxchronicle.com/
Jul 17 '05 #2

P: n/a
On Wed, 26 Nov 2003 14:48:58 +0000, Kevin Thorpe <ke***@pricetrak.com> wrote:
I think I know the answer to this, but suggestions welcomed.

I have to store lots of information. An electricity reading every half
hour for hundreds of meters for several years.

The ways I thought of storing this in SQL were:
1. One table row per day with 48 fields for the readings.
144M table, 3M index
saves space and having a long skinny table
however, getting the min/max reading in a day is a pain. I
think I'll have to store it which goes against the grain
2. One table row per reading.
376M table, 231M index
big time long skinny table with more control information than
content, huge waste of disk space. However it is first normal
form so queries are easier.
3. Two tables, one to hold control information and one for the readings.
added complications because of the join and creation of an
internal link field.

A quick trial proved 1 to be cumbersome but space saving. I'm currently
using option 2.

any suggestions or recommenndations?


What database are you implementing this on? If it were Oracle, it would look
like a prime candidate for an Index-Organised table, which would cut down your
storage requirements. I don't think MySQL has an equivalent. I vaguely recall
the SQL Server equivalent being a 'clustered index' or something like that.

--
Andy Hassall (an**@andyh.co.uk) icq(5747695) (http://www.andyh.co.uk)
Space: disk usage analysis tool (http://www.andyhsoftware.co.uk/space)
Jul 17 '05 #3

P: n/a
David Mackenzie wrote:
On Wed, 26 Nov 2003 14:48:58 +0000, Kevin Thorpe <ke***@pricetrak.com>
wrote:

I think I know the answer to this, but suggestions welcomed.

I have to store lots of information. An electricity reading every half
hour for hundreds of meters for several years.

The ways I thought of storing this in SQL were:
1. One table row per day with 48 fields for the readings.
144M table, 3M index
saves space and having a long skinny table
however, getting the min/max reading in a day is a pain. I
think I'll have to store it which goes against the grain

Ouch!

2. One table row per reading.
376M table, 231M index
big time long skinny table with more control information than
content, huge waste of disk space. However it is first normal
form so queries are easier.
3. Two tables, one to hold control information and one for the readings.
added complications because of the join and creation of an
internal link field.

It really ought to be option 3 (which is in 3NF), as you're
duplicating your control info with option 2. Indexing the date and
control id should help performance.


That's what I thought. I started off using option 2 because my control
information was fairly small (meter number, date, reading number) but
I've now had to add extra dimensions to it (some readings are estimated
or modelled) so a migration to two tables seems required.
Depending on what kinds of reports you want to run, you may be able to
archive off old data with option 3 thereby keeping the database to a
manageable size. We do that with some of our customer's history
records whenever they notice the system becoming slow.
I try not to archive information. Disk space is cheap and properly
indexed speed shouldn't be an issue. I've found empirically that the
databases I've used fill up only at the rate hardware speeds up.
If most queries will concern, say, the previous week's readings you
could create a separate database for those and have a scheduled job
move the records into an archive database each day/week.

Are your controls grouped (say, geographically, by purpose, etc)? Will
you want to combine the readings of several controls?


The control fields are orthogonal so indexing is important. Lots of
EXPLAIN queries I think.

Jul 17 '05 #4

This discussion thread is closed

Replies have been disabled for this discussion.