
When denormalizing is essential

I have a database with over 450 tables that, until a short while ago,
had been strictly 3NF. In cases where complicated permutations of
data are required quickly for certain parts of the system, sometimes
it is not possible to write a query fast enough, due to the complexity
of deriving many nested levels of data.

Therefore for the few instances where this has been needed, I have
created carefully named summary tables, having one-to-one
relationships to real tables (same primary key), and containing only
the values which must be computed.
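
For illustration, the pattern is roughly this (table and column names
invented for the sketch):

-- Hypothetical base table
CREATE TABLE Orders (
    OrderID    int      NOT NULL PRIMARY KEY,
    CustomerID int      NOT NULL,
    OrderDate  datetime NOT NULL
)

-- One-to-one summary table: same primary key, computed values only
CREATE TABLE OrdersSummary (
    OrderID    int   NOT NULL PRIMARY KEY
               REFERENCES Orders (OrderID),
    LineCount  int   NOT NULL,
    TotalValue money NOT NULL
)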

This works fine. The question now is when to compute them.

Much as I strongly dislike triggers, I have created triggers for the
tables in question, which themselves recalculate the summary values,
and populate the summary tables.

Good news, loading is much faster.
Bad news, now that I'm taking the hit on the save, the saves are far
too slow.

The only way I can see that my saves could be sped up is to perhaps
spread the work a bit: maybe if I am adding fifty records to a table,
the trigger could somehow be disabled before the rows are added and
re-enabled afterwards, to calculate the lot as a set-based operation
(rather than calling the trigger 50 times) - but I don't see a way to
do that.
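
In outline I am imagining something like this (names invented for the
sketch) - though even if the mechanics work, the summary rows are
wrong until the recalculation runs, which may be the real showstopper:

-- Sketch only; assumes trg_Orders_Summary is the summary trigger
ALTER TABLE Orders DISABLE TRIGGER trg_Orders_Summary

-- bulk-load the new rows while the trigger is off
INSERT INTO Orders (OrderID, CustomerID, OrderDate)
VALUES (1001, 42, GETDATE())

ALTER TABLE Orders ENABLE TRIGGER trg_Orders_Summary

-- then recalculate the summary rows for the new orders in one pass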

Any help of any nature on this matter would be greatly appreciated

Cheers

Brian McGee
Jul 20 '05 #1

"Brian McGee" <br*********@Sentrio.com> wrote in message
news:83**************************@posting.google.c om...
> I have a database with over 450 tables that, until a short while ago,
> had been strictly 3NF. In cases where complicated permutations of
> data are required quickly for certain parts of the system, sometimes
> it is not possible to write a query fast enough, due to the complexity
> of deriving many nested levels of data.
>
> Therefore for the few instances where this has been needed, I have
> created carefully named summary tables, having one-to-one
> relationships to real tables (same primary key), and containing only
> the values which must be computed.
>
> This works fine. The question now is when to compute them.

Instead of creating these as tables, have you considered creating them
as views (or even inline functions, in certain cases)?
> Much as I strongly dislike triggers, I have created triggers for the
> tables in question, which themselves recalculate the summary values,
> and populate the summary tables.
>
> Good news, loading is much faster.
> Bad news, now that I'm taking the hit on the save, the saves are far
> too slow.
>
> The only way I can see that my saves could be sped up is to perhaps
> spread the work a bit: maybe if I am adding fifty records to a table,
> the trigger could somehow be disabled before the rows are added and
> re-enabled afterwards, to calculate the lot as a set-based operation
> (rather than calling the trigger 50 times) - but I don't see a way to
> do that.
>
> Any help of any nature on this matter would be greatly appreciated


How are your triggers act
Jul 20 '05 #2
Oops - I shall try to complete my email this time; typing a capital
'S' triggered "(S)end message"

"Steven Wilmot" <st*********@wilmot.me.uk> wrote in message
news:40***********************@news.aaisp.net.uk.. .

"Brian McGee" <br*********@Sentrio.com> wrote in message
news:83**************************@posting.google.c om...
I have a database with over 450 tables, that until a short while ago
has been strictly 3NF. In cases where complicated permutations of
data are required quickly for certain parts of the system, sometimes
it is not possible to write a query fast enough, due to the complexity
of deriving many nested levels of data.
Ah - misunderstanding of "not possible to write the the query quickly" and
not possible to get the query to execute quickly-enough ... Oops
>> Therefore for the few instances where this has been needed, I have
>> created carefully named summary tables, having one-to-one
>> relationships to real tables (same primary key), and containing only
>> the values which must be computed.
>>
>> This works fine. The question now is when to compute them.
>
> Instead of creating these as tables, have you considered creating them
> as views (or even inline functions, in certain cases)?


Ignore my comment above - I mis-read your first post
>> Much as I strongly dislike triggers, I have created triggers for the
>> tables in question, which themselves recalculate the summary values,
>> and populate the summary tables.
>>
>> Good news, loading is much faster.
>> Bad news, now that I'm taking the hit on the save, the saves are far
>> too slow.
>>
>> The only way I can see that my saves could be sped up is to perhaps
>> spread the work a bit: maybe if I am adding fifty records to a table,
>> the trigger could somehow be disabled before the rows are added and
>> re-enabled afterwards, to calculate the lot as a set-based operation
>> (rather than calling the trigger 50 times) - but I don't see a way to
>> do that.
>>
>> Any help of any nature on this matter would be greatly appreciated
>
> How are your triggers act


How are the triggers written? Are they designed to work against the
entire inserted/deleted virtual tables, or do they cursor through each
row... Worse still, do they only work on a single row?

How much of a performance hit have you noticed on the inserts? There's
a possibility that simplifying the trigger's operation may make things
easier.

Also, how important is it that your summary information is
up-to-the-minute?

One option is a simple trigger that puts all changed records in a
"queue table", with a job that processes anything in this queue table
once a minute.
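
In rough sketch form (all names invented):

-- Queue table: just the keys of rows whose summaries need refreshing
CREATE TABLE SummaryQueue (
    OrderID  int      NOT NULL PRIMARY KEY,
    QueuedAt datetime NOT NULL DEFAULT GETDATE()
)
GO

-- Cheap set-based trigger: record the affected keys and return
CREATE TRIGGER trg_Orders_Queue ON Orders
AFTER INSERT, UPDATE, DELETE
AS
    INSERT INTO SummaryQueue (OrderID)
    SELECT changed.OrderID
    FROM  (SELECT OrderID FROM inserted
           UNION
           SELECT OrderID FROM deleted) AS changed
    WHERE NOT EXISTS (SELECT * FROM SummaryQueue q
                      WHERE q.OrderID = changed.OrderID)
GO

The scheduled job then recalculates the summaries for every key in
SummaryQueue in one set-based pass, and deletes the keys it has
processed.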

Hopefully a few thoughts for you to consider.

Steven
Jul 20 '05 #3
Brian McGee (br*********@Sentrio.com) writes:
> Therefore for the few instances where this has been needed, I have
> created carefully named summary tables, having one-to-one
> relationships to real tables (same primary key), and containing only
> the values which must be computed.


Steven Wilmot first suggested views, then retracted the suggestion
when he realized the scope of your problem.

But did you consider *indexed* views? That is, views that are materialized,
and then SQL Server will maintain them. Of course this comes with an
overhead for updates, but not as big as for triggers.

On the other hand, far from everything can be implemented as indexed views,
so this may not be a solution. But it's worth investigating.

More details are in the CREATE INDEX topic in Books Online.

--
Erland Sommarskog, SQL Server MVP, so****@algonet.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techinf...2000/books.asp
Jul 20 '05 #4
> How are the triggers written? Are they designed to work against the
> entire inserted/deleted virtual tables, or do they cursor through each
> row... Worse still, do they only work on a single row?

They only work on the row being inserted or deleted. Can you do
otherwise??

> How much of a performance hit have you noticed on the inserts? There's
> a possibility that simplifying the trigger's operation may make things
> easier.

I have tried, but the calculations are extremely complicated. They
are running significantly faster than they did when first written, but
have reached their peak and are still too slow!

> Also, how important is it that your summary information is
> up-to-the-minute?

Unfortunately it's critical!
> But did you consider *indexed* views? That is, views that are
> materialized, and then SQL Server will maintain them. Of course this
> comes with an overhead for updates, but not as big as for triggers.

When are they materialized? I've had more than one scenario with
SQL Server where the documentation says one thing and testing says
another, when it comes to when computed values are materialized!

Brian
Jul 20 '05 #5
On 05 Apr 2004 09:07:29 GMT, Brian McGee wrote:
>> How are the triggers written? Are they designed to work against the
>> entire inserted/deleted virtual tables, or do they cursor through
>> each row... Worse still, do they only work on a single row?
>
> They only work on the row being inserted or deleted. Can you do
> otherwise??


Yes. Any well-designed trigger will work on all records in the
inserted and/or deleted pseudo-tables at once.

I've seen triggers that read one row from inserted/deleted into some
variables and process those. They will not work as expected as soon
as you insert, update, or delete more than one row in an operation.

I've seen triggers that loop through all rows in inserted/deleted,
either through a cursor or through some nifty SQL logic. They will do
what you expect them to, but they're slow.

A good trigger uses the set-oriented approach towards the inserted and
deleted tables. That's what SQL Server is optimised for. If you can
give us some samples of your current trigger code, we might be able to
improve on them.
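
As a shape to aim for, something like this (all names invented, and
assuming a summary table keyed on OrderID):

-- Set-based: one UPDATE covering every row the statement touched
CREATE TRIGGER trg_OrderLines_Summary ON OrderLines
AFTER INSERT, UPDATE, DELETE
AS
    UPDATE s
    SET    s.LineCount  = t.LineCount,
           s.TotalValue = t.TotalValue
    FROM   OrdersSummary s
    JOIN  (SELECT ol.OrderID,
                  COUNT(*)      AS LineCount,
                  SUM(ol.Value) AS TotalValue
           FROM   OrderLines ol
           WHERE  ol.OrderID IN (SELECT OrderID FROM inserted
                                 UNION
                                 SELECT OrderID FROM deleted)
           GROUP BY ol.OrderID) AS t
      ON   t.OrderID = s.OrderID
    -- (deletes that remove an order's last line would need extra
    --  handling; sketch only)

The point is that inserted and deleted are each referenced once, as
sets, however many rows the statement changed.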

> I have tried, but the calculations are extremely complicated. They
> are running significantly faster than they did when first written, but
> have reached their peak and are still too slow!


Again: post some samples. You'd be astonished what the combined brain
power of the regular posters here might come up with.

>> Also, how important is it that your summary information is
>> up-to-the-minute?
>
> Unfortunately it's critical!


In that case, your own suggestion to temporarily disable the trigger
and do all calculations after enough inserts have been processed is
not the way to go.

>> But did you consider *indexed* views? That is, views that are
>> materialized, and then SQL Server will maintain them. Of course this
>> comes with an overhead for updates, but not as big as for triggers.
>
> When are they materialized? I've had more than one scenario with
> SQL Server where the documentation says one thing and testing says
> another, when it comes to when computed values are materialized!

If you define a view WITH SCHEMABINDING and then create a UNIQUE
CLUSTERED INDEX on that view, the view will be materialized. Meaning
that the select statement in the view declaration will be executed and
the result set stored, just as if you'd made a new table; after that,
SQL Server will see to it that changes to the underlying tables will
propagate to the view. This increases the execution time for changes
to the base tables but decreases the time to query the view. Plus, the
extra index might sometimes be used to get better execution plans.
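
In sketch form (names invented; indexed views also require two-part
table names, COUNT_BIG(*) alongside any other aggregate, a handful of
SET options in effect, and here assume OrderLines.Value is declared
NOT NULL):

-- The schema-bound view
CREATE VIEW dbo.OrderTotals
WITH SCHEMABINDING
AS
    SELECT   ol.OrderID,
             SUM(ol.Value) AS TotalValue,
             COUNT_BIG(*)  AS LineCount
    FROM     dbo.OrderLines ol
    GROUP BY ol.OrderID
GO

-- Creating the unique clustered index is what materializes the view
CREATE UNIQUE CLUSTERED INDEX IX_OrderTotals
ON dbo.OrderTotals (OrderID)
GO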

Best, Hugo
--

(Remove _NO_ and _SPAM_ to get my e-mail address)
Jul 20 '05 #6

Thanks!

I've tried Indexed Views on a small-scale test and they seem to work a
treat. I've a bit of remixing to do before the application as a whole
will use them, but I'm confident now that they will work.

Much appreciated :-)

Brian
Jul 20 '05 #7

Grrr.. Ok. Now that I've tried using indexed views for what I actually
need them for, I find that sum, min, max, outer joins, subqueries, etc
are not allowed. What's the bloody use of indexed views if you can't
write complex queries? So I'm back to square one :(

Anyone any ideas?

Brian

Jul 20 '05 #8

On 13 Apr 2004 09:57:15 GMT, Brian McGee wrote:

> Grrr.. Ok. Now that I've tried using indexed views for what I actually
> need them for, I find that sum, min, max, outer joins, subqueries, etc
> are not allowed. What's the bloody use of indexed views if you can't
> write complex queries? So I'm back to square one :(
>
> Anyone any ideas?


Brian,

Too bad that didn't work out.

As Steven and I already suggested before in this thread, we can only
take a stab at optimizing your code if we get to see it. The best way
to get help in this newsgroup is to post the following:

1. DDL for all tables that participate in the problem. You can leave
out the columns that are irrelevant to the situation. Do include all
indexes and all constraints you have.

2. (If necessary to reproduce a problem) Sample data that shows the
problem. Post this in the form of INSERT statements, so that we can
copy and paste it into Query Analyzer and execute it there. Most of us
don't have the time to manually enter a whole set of data. Use the
minimum amount of data necessary to reproduce the problem.
(Note - there are scripts that can generate this kind of output for
you. I think I saw a reference to it not too long ago in
microsoft.public.sqlserver.programming. Google for it if you are
interested.)
(Note to Brian - since you don't ask about strange behaviour but about
performance, you can skip this step, though it might help put across
what you're trying to accomplish)

3. The output you expect.

Make sure to test that executing the DDL and inserts from steps 1 and
2 works as expected in an empty database. Then copy it all into your
message and post.
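
A toy example of the format (entirely invented, just to show the
shape):

-- 1. DDL
CREATE TABLE Orders (
    OrderID   int      NOT NULL PRIMARY KEY,
    OrderDate datetime NOT NULL
)

-- 2. Sample data, as INSERT statements
INSERT INTO Orders (OrderID, OrderDate) VALUES (1, '20040401')
INSERT INTO Orders (OrderID, OrderDate) VALUES (2, '20040402')

-- 3. Expected output: both rows, newest first
SELECT OrderID, OrderDate FROM Orders ORDER BY OrderDate DESC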

Best, Hugo
--

(Remove _NO_ and _SPAM_ to get my e-mail address)
Jul 20 '05 #10
