Bytes | Software Development & Data Engineering Community

Normalization and AVG across columns

I have a table that holds the employee id, a job name and then 5
columns (Proposal Phase Rating, Preconstruction Phase Rating, etc.)
that hold a number rating 1-5. I've read that this design may not be
considered "normalized" because I will need to get an average (Job
Average) across the columns. My questions are:

1) Should I break this table down differently, having 1 column for
EmpID, 1 column for Job Name, 1 column for "Rating Type" and 1 column
for "Rating Score"? If I alter the table to reflect the changes in the
previous sentence, doesn't that require extra data entry (i.e. Job
Name and EmpID have to be entered each time a phase's score is
entered, rather than once per job)?

2) Unrelated to the questions above, I have a table with more than 40
columns called Emp_Main. This table holds 1 record for each employee
and includes emergency contact info, benefit info, etc. There will be
fewer than 300 records in this table. Is it better for me to break the
table down so that Contact Info, Benefit Info, etc. are their own
tables? Is it better to have 1 large table or many small ones?

Thanks,
Kovax

Mar 1 '06 #1
Kovax,

1) That sounds fine. Technically, there's still some repetition in the
JobName field, but that's probably not such a huge sin.

2) No. You don't want to break out fields into separate tables unless
they repeat at different intervals, or for different reasons. For
instance, your Emp_Main table has one record per employee. In your
other table, there's more than one record per employee, so those fields
are (rightly) broken out into another table. So, unless you need to
capture history of those other fields, I would say just keep them in
the same table.

For a good write-up of the reasoning behind decisions like this, have a
look at Paul Litwin's excellent article "The Fundamentals of Relational
Database Design". There's a copy of it, with his permission, on my old
website http://www.abcdataworks.com in the Developers' section.

Jeremy
--
Jeremy Wallace
Fund for the City of New York
http://metrix.fcny.org

Mar 1 '06 #2
In message <11**********************@z34g2000cwc.googlegroups.com>,
br****@rstenstrom.com writes
I have a table that holds the employee id, a job name and then 5
columns (Proposal Phase Rating, Preconstruction Phase Rating, etc.)
that hold a number rating 1-5. I've read that this design may not be
considered "normalized" because I will need to get an average (Job
Average) across the columns. My questions are:
This is a bit of database theory that covers normalisation. (I'm in the
UK, so I use UK spelling.) Forget that you are using Access for a while,
because Access 12 might have new features.

You've done the difficult part and realised that if you store the
average value in that table then it won't be normalised. The next thing
to decide is whether that really matters to you. Sometimes it's OK to
have denormalised data, as long as you recognise that and are prepared
for the consequences.

Denormalisation can cause problems in any of the fundamental data
operations: add, modify, delete. If you understand what can happen, and
are prepared for it, denormalisation is OK. It usually requires some
extra programming to keep the dependent data synchronised.

In this particular instance a delete operation doesn't require any
special action because you can delete the individual and average data
simultaneously. They are all held in the same record, there's no risk.

Adding a record has a risk: you need to be sure that when the record is
created, the average field is filled with the correct value. You also
need to consider pathological conditions. What is the average if one of
the contributing values is null? Is this an error? Should you use the
average of the values you have, or return a null average? It's your
data, you need to answer these questions before you commit to a database
design.
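The null question has a concrete answer in SQL engines: the AVG
aggregate ignores nulls rather than propagating them, whereas an
expression like (r1+r2+r3+r4+r5)/5 over five rating columns returns
null if any one column is null. A minimal sketch using SQLite from
Python; the JobRatings table and its columns are hypothetical,
modelled on the normalised design from question 1:

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()

# Hypothetical normalised ratings table: one row per phase score.
cur.execute(
    "CREATE TABLE JobRatings "
    "(EmpID INTEGER, JobName TEXT, RatingType TEXT, RatingScore INTEGER)"
)
cur.executemany(
    "INSERT INTO JobRatings VALUES (?, ?, ?, ?)",
    [
        (1, "Job A", "Proposal Phase", 4),
        (1, "Job A", "Preconstruction Phase", 5),
        (1, "Job A", "Construction Phase", None),  # missing score
    ],
)

# AVG skips the null row: (4 + 5) / 2 = 4.5, not null.
avg = cur.execute(
    "SELECT AVG(RatingScore) FROM JobRatings "
    "WHERE EmpID = 1 AND JobName = 'Job A'"
).fetchone()[0]
print(avg)  # 4.5
```

Whether 4.5 or null is the right answer for a job with a missing score
is still a business decision; the engine just makes "average of the
values you have" the default behaviour.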

Amending a record has similar risks. If you amend one of the
contributing figures, you have to be sure that the average is
recalculated and stored before anyone else can read the stale value.
Just as when adding a record, you need to consider what to do with null
data.

How you resolve these issues depends on the database implementation and
the performance requirements.

You could store only the contributing figures and calculate the average
on the fly each time. You keep the database normalised but at the cost
of having to repeat the same calculation each time you read the data. If
the data is written and never modified, this is wasteful. (If your
database supports them you could create a view (a Query in Access) and
include the average as a calculated field.)
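With the normalised one-row-per-score design from question 1, the
"view with a calculated field" approach can be sketched like this
(SQLite from Python; all table, view, and column names are
hypothetical):

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()

# Hypothetical normalised table: one row per (employee, job, phase) score.
cur.execute(
    "CREATE TABLE JobRatings "
    "(EmpID INTEGER, JobName TEXT, RatingType TEXT, RatingScore INTEGER)"
)
cur.executemany("INSERT INTO JobRatings VALUES (?, ?, ?, ?)", [
    (1, "Job A", "Proposal Phase", 4),
    (1, "Job A", "Preconstruction Phase", 2),
    (2, "Job B", "Proposal Phase", 5),
])

# The average is a calculated field in a view, never stored,
# so it can never fall out of sync with the individual scores.
cur.execute("""
    CREATE VIEW JobAverages AS
    SELECT EmpID, JobName, AVG(RatingScore) AS JobAverage
    FROM JobRatings
    GROUP BY EmpID, JobName
""")
for row in cur.execute("SELECT * FROM JobAverages ORDER BY EmpID"):
    print(row)  # (1, 'Job A', 3.0) then (2, 'Job B', 5.0)
```

The cost is exactly as described above: the GROUP BY recomputes the
average on every read, which only hurts if the data is read far more
often than it changes.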

If your database supports them you could use triggers to recalculate the
average and store it whenever a record is added or amended. If it
doesn't (the Jet database in Access doesn't), you can write some
middleware code and have your applications talk to the middleware
instead of talking directly to the database.
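Jet has no triggers, but in an engine that does, the idea can be
sketched like this (SQLite from Python; the JobSummary table holding
the stored average, and all other names, are hypothetical):

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.executescript("""
    CREATE TABLE JobRatings
        (EmpID INTEGER, JobName TEXT, RatingType TEXT, RatingScore INTEGER);

    -- Denormalised summary: the stored average that must be kept in sync.
    CREATE TABLE JobSummary
        (EmpID INTEGER, JobName TEXT, JobAverage REAL,
         PRIMARY KEY (EmpID, JobName));

    -- Recalculate the stored average whenever a score is added.
    CREATE TRIGGER sync_avg AFTER INSERT ON JobRatings
    BEGIN
        INSERT OR REPLACE INTO JobSummary (EmpID, JobName, JobAverage)
        VALUES (NEW.EmpID, NEW.JobName,
                (SELECT AVG(RatingScore) FROM JobRatings
                 WHERE EmpID = NEW.EmpID AND JobName = NEW.JobName));
    END;
""")
cur.execute("INSERT INTO JobRatings VALUES (1, 'Job A', 'Proposal Phase', 4)")
cur.execute("INSERT INTO JobRatings VALUES (1, 'Job A', 'Preconstruction Phase', 2)")
print(cur.execute("SELECT JobAverage FROM JobSummary").fetchone()[0])  # 3.0
```

A real implementation would need matching AFTER UPDATE and AFTER DELETE
triggers as well, so that amendments and deletions keep the stored
average in sync too.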

1) Should I break this table down differently, having 1 column for
EmpID, 1 column for Job Name, 1 column for "Rating Type" and 1 column
for "Rating Score"? If I alter the table to reflect the changes in the
previous sentence, doesn't that require extra data entry (i.e. Job
Name and EmpID have to be entered each time a phase's score is
entered, rather than once per job)?

2) Unrelated to the questions above, I have a table with more than 40
columns called Emp_Main. This table holds 1 record for each employee
and includes emergency contact info, benefit info, etc. There will be
fewer than 300 records in this table. Is it better for me to break the
table down so that Contact Info, Benefit Info, etc. are their own
tables? Is it better to have 1 large table or many small ones?


If you have sensitive data like emergency contact data then you need to
keep it safe. That's probably easier if you have fewer tables. It's also
easier to extract the data your management need to have at home in case
the office burns down.

Database design is partly driven by the theoretical (conceptual &
logical) structure of the data. You need that before you decide what to
store. It's also driven by practicality, your choice of database
software and the purpose of the data. There are usually several
different ways of building the database from a logical structure.
Different structures optimise different things; database designers
should know what needs to be optimised in any particular situation.


--
Bernard Peek
London, UK. DBA, Manager, Trainer & Author.

Mar 1 '06 #3
Thanks, gentlemen. I appreciate your time and information.

Mar 1 '06 #4


