I'm thinking about a project and could use some advice from those who
understand the dimensional limits of an SQL-like DB.
Imagine, for example, a database containing detailed data for many
cities during thunderstorms. Each time there is a storm, data
collection begins, gathering "rain-drops-per-millisecond" maybe 5 times
per second (don't worry about the sanity of this, it's just an analogy).
The logistical data is of no concern to me (cities, dates, storm
duration, etc.). I anticipate a few hundred cities but also thousands
of storm events associated with those various cities.
However, the storage of the rain-storm measurement data is my question.
Should "rain-drop-rate" data be fields in a rain-drop-rate measurements
table, with columns of "rate-drop-rate" and a "storm event" foreign key
for the storm to which the event belongs? That seems like it would be
a poor use of DB resources, as there would be billions of entries in
that table after many hundreds or thousands of storms are captured.
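Roughly what I have in mind for that layout - table and column names
are just placeholders, not a final schema:

    CREATE TABLE storm_event (
        storm_event_id  INTEGER PRIMARY KEY,
        city            TEXT,
        started_at      TIMESTAMP
    );

    CREATE TABLE rain_drop_rate_measurement (
        storm_event_id  INTEGER REFERENCES storm_event(storm_event_id),
        measured_at     TIMESTAMP,   -- or an offset from storm start
        rain_drop_rate  REAL
    );

    -- an index on the foreign key keeps per-storm lookups workable
    -- even once the table holds billions of rows
    CREATE INDEX idx_measurement_storm
        ON rain_drop_rate_measurement (storm_event_id, measured_at);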
The alternatives that come to mind are:
a) each storm event has a file reference, and the file contains
thousands of pairs of time and rain-drop-rate (a sketch of what I mean
follows below). Then I have directories full of files.
b) each storm event is a new table in the database, with columns for
"rain-drop-rate" and "time of measurement". (The database still gets
filled with billions of entries, but spread among thousands of tables.)
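For option (a), roughly what I picture - again just a placeholder
sketch, with the file path stored alongside the storm metadata and the
measurements kept outside the DB:

    CREATE TABLE storm_event (
        storm_event_id  INTEGER PRIMARY KEY,
        city            TEXT,
        started_at      TIMESTAMP,
        -- path to a flat file of (time, rain-drop-rate) pairs
        data_file       TEXT
    );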
Is there some other option I should consider? It seems like (a) is the
right answer, but your input is appreciated...
Ross.