By using this site, you agree to our updated Privacy Policy and our Terms of Use. Manage your Cookies Settings.
454,965 Members | 1,136 Online
Bytes IT Community
+ Ask a Question
Need help? Post your question and get tips & solutions from a community of 454,965 IT Pros & Developers. It's quick & easy.

Normalization and AVG across columns

P: n/a
I have a table that holds the employee id, a job name and then 5
columns (Proposal Phase Rating, Preconstruction Phase Rating, etc.)
that hold a number rating 1-5. I've read that this design may not be
considered "normalized" due to the fact that I will need to get an
average (Job Average) across the columns. My question(s) is:

1) Should I break this table down differently, having 1 column for
EmpID, 1 column for Job Name, 1 column for "Rating Type" and 1 column
for "Rating Score"? If I do alter the table to reflect that changes in
the previous sentence, doesn't that provide for extra data entries (ie
Job name and EmpID have to be entered in each time a phases score is
entered rather than each time a new job is entered).

2) Unrelated to the questions above, I have a table with more than 40
columns called Emp_Main. This table holds 1 record for each employee
and includes emergency contact info, benefit info, etc. There will be
less than 300 records in this table. Is it better for me to break the
table down so that Contact Info, Benefit Info, etc are their own table?
Is it better to have 1 large table or many small ones?

Thanks,
Kovax

Mar 1 '06 #1
Share this Question
Share on Google+
3 Replies


P: n/a
Kovax,

1) That sounds fine. Technically, there's still some repetition in the
JobName field, but that's probably not such a huge sin.

2) No. You don't want to brek out fields into separate tables unless
they repeat at different intervals, or for different reasons. For
instance, your Emp_Main table has one record per employee. In your
other table, there's more than one record per employee, so those fields
are (rightly) broken out into another table. So, unless you need to
capture history of those other fields, I would say just keep them in
the same table.

For a good write-up of the reasoning behind decisions like this, have a
look at Paul Litwin's excellent article "The Fundamentals of Relational
Database Design". There's a copy of it, with his permission, on my old
website http://www.abcdataworks.com in the Developers' section.

Jeremy
--
Jeremy Wallace
Fund for the City of New York
http://metrix.fcny.org

Mar 1 '06 #2

P: n/a
In message <11**********************@z34g2000cwc.googlegroups .com>,
br****@rstenstrom.com writes
I have a table that holds the employee id, a job name and then 5
columns (Proposal Phase Rating, Preconstruction Phase Rating, etc.)
that hold a number rating 1-5. I've read that this design may not be
considered "normalized" due to the fact that I will need to get an
average (Job Average) across the columns. My question(s) is:
This is a bit of database theory that covers normalisation (I'm in the
UK, I use UK spelling.) Forget that you are using Access for a while,
because Access 12 might have new features.

You've done the difficult part and realised that if you store the
average value in that table then it won't be normalised. The next thing
to decide is whether that really matters to you. Sometimes it's OK to
have denormalised data, as long as you recognise that and are prepared
for the consequences.

Normalisation can cause problems in any of the fundamental data
operations; add, modify, delete. If you understand what can happen, and
are prepared for it, denormalisation is OK. It usually requires some
extra programming to keep the dependent data synchronised.

In this particular instance a delete operation doesn't require any
special action because you can delete the individual and average data
simultaneously. They are all held in the same record, there's no risk.

Adding a record has a risk, you need to be sure that when the record is
created the average field is filled with the correct value. You also
need to consider pathological conditions. What is the average if one of
the contributing values is null? Is this an error? Should you use the
average of the values you have, or return a null average? It's your
data, you need to answer these questions before you commit to a database
design.

Amending a record has similar risks. If you amend one of the
contributing figures you have to be sure that the average is
recalculated and stored before anyone else can read the wrong data. Just
as when adding a record you need to consider what to do with null data.

How you resolve these issues depends on the database implementation and
the performance requirements.

You could store only the contributing figures and calculate the average
on the fly each time. You keep the database normalised but at the cost
of having to repeat the same calculation each time you read the data. If
the data is written and never modified this is wasteful. (If your
database supports them you could create a view (a Query in Access) and
include the average as a calculated field.)

If your database supports them you could use triggers to recalculate the
average and store it whenever a record is added. If it doesn't (The Jet
database in Access doesn't) you can write some middleware code and have
your applications talk to the middleware instead of talking direct to
the database.

1) Should I break this table down differently, having 1 column for
EmpID, 1 column for Job Name, 1 column for "Rating Type" and 1 column
for "Rating Score"? If I do alter the table to reflect that changes in
the previous sentence, doesn't that provide for extra data entries (ie
Job name and EmpID have to be entered in each time a phases score is
entered rather than each time a new job is entered).

2) Unrelated to the questions above, I have a table with more than 40
columns called Emp_Main. This table holds 1 record for each employee
and includes emergency contact info, benefit info, etc. There will be
less than 300 records in this table. Is it better for me to break the
table down so that Contact Info, Benefit Info, etc are their own table?
Is it better to have 1 large table or many small ones?


If you have sensitive data like emergency contact data then you need to
keep it safe. That's probably easier if you have fewer tables. It's also
easier to extract the data your management need to have at home in case
the office burns down.

Database design is partly driven by the theoretical (conceptual &
logical) structure of the data. You need that before you decide what to
store. It's also driven by practicality, your choice of database
software and the purpose of the data. There are usually several
different ways of building the database from a logical structure.
Different structures optimise different things, database designers
should know what needs to be optimised in any particular situation.


--
Bernard Peek
London, UK. DBA, Manager, Trainer & Author.

Mar 1 '06 #3

P: n/a
Thanks gentleman. I appreciate your time and information.

Mar 1 '06 #4

This discussion thread is closed

Replies have been disabled for this discussion.