
Database Statistics... Best way to maintain stats???

Hi all,
I have an application which logs a considerable amount of data. Each day,
we log about 50,000 to 100,000 rows of data.

We like to report on this data... currently I'm using a stored procedure to
calculate the statistics, but since the reports are ad hoc, they take a while
to generate.

So how do you guys handle large amounts of data? Is there a good way to
precalculate a set of statistics to handle ad hoc queries (e.g. by hour,
by day, by week, by month)? Our application also provides near-realtime
statistics... so precalculation has to be done on a continual basis. Does
.NET have any statistics classes that might help out with this sort of
thing? I don't think performance counters will work since they don't log
persistent data.

Any ideas?

Thanks!

--
Lucas Tam (RE********@rogers.com)
Please delete "REMOVE" from the e-mail address when replying.
http://members.ebay.com/aboutme/coolspot18/
Nov 19 '05 #1
2 Replies


You might look at using cubes for rollups. If you need realtime ad hoc stats
(with slice and dice), then you want to use a snowflake or star schema.

Since you are only doing 100k rows a day (pretty small, actually), a simple
star schema should do it (with some rollups). With volume this low, I'd
update the rollup tables in real time (a simple trigger). I'd expect no query
to take over a second or two (unless it covers years of detail).
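
For illustration, a real-time rollup trigger along those lines could look
roughly like this (a sketch only; EventLog, HourlyStats and their columns are
made-up names, not anything from the original post):

-- assumes a detail table EventLog(EventTime datetime, Amount decimal(18,2))
-- and a rollup table HourlyStats(HourStart datetime primary key,
--                                RecordCount int, TotalAmount decimal(18,2))
create trigger trg_EventLog_Rollup on EventLog
after insert
as
begin
    set nocount on;

    -- roll the newly inserted rows up by hour;
    -- dateadd(hour, datediff(hour, 0, x), 0) truncates a datetime to the hour
    update h
       set h.RecordCount = h.RecordCount + i.cnt,
           h.TotalAmount = h.TotalAmount + i.total
      from HourlyStats h
      join (select dateadd(hour, datediff(hour, 0, EventTime), 0) as HourStart,
                   count(*) as cnt,
                   sum(Amount) as total
              from inserted
             group by dateadd(hour, datediff(hour, 0, EventTime), 0)) i
        on i.HourStart = h.HourStart;

    -- add buckets for hours that don't exist in the rollup table yet
    insert into HourlyStats (HourStart, RecordCount, TotalAmount)
    select dateadd(hour, datediff(hour, 0, ins.EventTime), 0),
           count(*),
           sum(ins.Amount)
      from inserted ins
     where not exists (select 1
                         from HourlyStats h
                        where h.HourStart =
                              dateadd(hour, datediff(hour, 0, ins.EventTime), 0))
     group by dateadd(hour, datediff(hour, 0, ins.EventTime), 0);
end

At 100k rows a day the extra work in the trigger is negligible; under much
heavier concurrent insert load you'd want to think about the race between the
update and the insert of a new bucket.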

-- bruce (sqlwork.com)
"Lucas Tam" <RE********@rogers.com> wrote in message
news:Xn***************************@127.0.0.1...
Hi all,
I have an application which logs a considerable amount of data. Each day,
we log about 50,000 to 100,000 rows of data.

We like to report on this data... currently I'm using a stored procedure
to
calculate the statistics, however since this is an ad hoc, reports take a
while to generate.

So how do you guys handle large amounts of data? Is there a good way to
precalculate a a set of statistics to handle ad hoc queries (i.e. By Hour,
By Day, By Week, By Month). Our application also provides near realtime
statistics... so precalculation has to be done on a continual basis. Does
.NET have any statistics classes that might help out with this sort of
thing? I don't think Performance counters will work since they don't log
persistent data.

Any ideas?

Thanks!

--
Lucas Tam (RE********@rogers.com)
Please delete "REMOVE" from the e-mail address when replying.
http://members.ebay.com/aboutme/coolspot18/

Nov 19 '05 #2

Or, if you don't want to get into hypercubes, you can "squash" the data from
the first table into a second table with an aggregate query that adds, at
minimum, a RecordCount column and removes as much unnecessary detail as
possible.

Then, if you need to include both tables in a query, you can create a union
query that simply adds a RecordCount of 1 to the first table so it has the
same number of columns, e.g.:

select field1, field2, field3, 1 as recordcount from table1
union all
select field1, field2, field3, recordcount from table2

There is an efficient way to do the transfer: create a transactional stored
procedure that uses a temporary table to index the primary keys of the
records to be transferred (usually based on date, say 10,000 every 5 minutes
or whatever fits), so it looks roughly like this:

select top 10000 primarykey into #temp from table1 order by date
--
insert into table2 (field1, field2, field3, recordcount)
select field1, field2, field3, count(*)
from table1
where primarykey in (select primarykey from #temp)
group by field1, field2, field3
--
delete
from table1
where primarykey in (select primarykey from #temp)

(you will have to add the transaction and error handling bits yourself)
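
For what it's worth, on SQL Server 2005 or later one way to wrap it is a
TRY/CATCH block around the whole batch; this is only a sketch, using the same
made-up table and column names as above:

begin try
    begin transaction;

    select top 10000 primarykey
    into #temp
    from table1
    order by date;

    insert into table2 (field1, field2, field3, recordcount)
    select field1, field2, field3, count(*)
    from table1
    where primarykey in (select primarykey from #temp)
    group by field1, field2, field3;

    delete from table1
    where primarykey in (select primarykey from #temp);

    commit transaction;
end try
begin catch
    if @@trancount > 0
        rollback transaction;

    -- re-raise so the caller or job sees the failure
    declare @msg nvarchar(2048);
    set @msg = error_message();
    raiserror(@msg, 16, 1);
end catch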

I used that when getting around 100,000 records per day with heaps of
detail. It compressed nicely, about 10 to 1, and my nastiest OLAP-style
T-SQL queries (12 months of data) took around 15 seconds.

And you can easily simulate OLAP's intermediate calculations with a single
table refreshed every day or on demand.
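
Something like the following would do for the refresh (again just a sketch;
DailySummary is an invented name and it assumes field3 is the datetime
column):

-- rebuild a one-table daily summary from the squashed table
-- (illustrative names; assumes field3 holds the event datetime)
truncate table DailySummary;

insert into DailySummary (SummaryDate, field1, field2, recordcount)
select dateadd(day, datediff(day, 0, field3), 0) as SummaryDate,
       field1, field2, sum(recordcount)
  from table2
 group by dateadd(day, datediff(day, 0, field3), 0), field1, field2;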

Nov 19 '05 #3
