> When I asked about how to deal with increasing volumes of data, you
answered
"sound database design". Do you have any examples?
I don't really have any generic examples of "sound database design"... it's
just a matter of proper normalization and designing your tables correctly
for the information you want to store, and not designing them for the
information you don't care about. Depending on the needs of the
application, it might be that you want the design optimized for INSERT
performance, and the people querying the data can suffer a little worse
performance for it. Or, it might be that the people querying the data are
more demanding (or it happens more) so that is where you should spend your
time focusing on efficiency of design. I really don't know. But I can
suggest at least a couple of ways to prevent such an app from chewing disk
space all day and night (at slight query performance costs).
For log applications in particular, there is often a whole lot of useless /
redundant information. One thing is that a lot of people store the entire
user agent string in every single row. What we did in one application is
have a set of *common* browsers, taught our application how to lump them
into categories, and stored their representation instead. So we had a table
with a TINYINT column and a VARCHAR(32) column, which stored the "browser
map" (about 20 rows) and then only stored the tinyint value with the large
set of data. This reduced the large fact tables from a VARCHAR(255) to a
TINYINT, so a savings of 254 bytes per row. Basically:
CREATE TABLE BrowserMap
(
BrowserMapID TINYINT
PRIMARY KEY CLUSTERED
BrowserName VARCHAR(32)
)
CREATE TABLE RawLog
(
RawLogID INT IDENTITY(1, 1)
PRIMARY KEY NONCLUSTERED,
dtEvent DATETIME NOT NULL,
BrowserMapID TINYINT FOREIGN KEY
REFERENCES BrowserMap(BrowserMapID),
...
)
Another thing we did is store the IP address in four TINYINT columns, rather
than a big VARCHAR(15). There is a discussion about various ways to reduce
storage space for an IP in
http://www.aspfaq.com/2450, and some general
database efficiency suggestions at
http://www.aspfaq.com/2424#db