473,554 Members | 2,447 Online
Bytes | Software Development & Data Engineering Community
+ Post

Home Posts Topics Members FAQ

SET vs Fully Normalised vs Both

Hi,

I'm looking for some advice on database structure.

Without unnecessary details, I have a database that deals, in part,
with 'matches' each of which progresses through a series of statuses.
I need to store when the match progressed to each of these statuses.

At present, I have two tables, matches and matchstatuses:

matches:
matchid (PK)
etc...

matchstatus:
matchstatusid (PK)
matchid
status (ENUM)
datechanged
etc...

This works fine, except that it's somewhat messy to query, and I
assume potentially slow, as I have to resort to using HAVING ... an
awful lot.

What I was considering is using a SET field, status, in the matches
table, whose values coincided with the status ENUM. I would just
update this at the same time as any inserts into matchstatuses. I'm
using transactions, and the code is reasonably simple, so maintaining
consistency isn't an issue. This would certainly simplify a lot of my
queries, as most don't require the date information.

My main concern is with style. This seems a somewhat messy solution
to the problem, and I can well imagine such redundancy being frowned
upon.

I did also consider just adding a set of fields to the match table,
each nullable, and corresponding to a given status. This however
seemed wrong as:

1) A lot of the matches will not go through a lot of the statuses,
leading to wasted storage of 4 bytes per status per record.

2) I'd rather update the ENUM / SET definitions than add additional
fields if new statuses need to be added.

3) ENUMs and SETs are sort of normalised, whereas having a field per
status clearly isn't.

I'd be very grateful for any thoughts on whether using an ENUM and SET
in the above mentioned way is considered very bad form, as well as
whether using a field per status is better or worse. I'd also be very
grateful of any suggestions as to an better approach to this.

Many thanks,

Chris Key
Jul 20 '05 #1
2 1873
Christopher Key wrote:
My main concern is with style. This seems a somewhat messy solution
to the problem, and I can well imagine such redundancy being frowned
upon.
I try to avoid redundancy because there's a risk that the data will
become inconsistent. There's the academic issue of style or relational
normalization, but the more practical risk that is more likely to bite
you is data integrity.
I did also consider just adding a set of fields to the match table,
each nullable, and corresponding to a given status. This however
seemed wrong as:

1) A lot of the matches will not go through a lot of the statuses,
leading to wasted storage of 4 bytes per status per record.

2) I'd rather update the ENUM / SET definitions than add additional
fields if new statuses need to be added.

3) ENUMs and SETs are sort of normalised, whereas having a field per
status clearly isn't.


Actually, I believe they're equivalent with respect to breaking
normalization. But it's only a fine point; normalization often has to
be broken to achieve some tradeoff with query simplicity or efficiency.

Here's another possible solution: you could keep the two-table design,
and create a "virtual" column-per-status-value, by doing an outer join
to the matchstatus table for each status value:

select distinct m.*, ms1.status, ms2.status, ms3.status
from matches m
left outer join matchstatus ms1 on m.matchid = ms1.matchid and
ms1.status = 'STATUS1'
left outer join matchstatus ms2 on m.matchid = ms2.matchid and
ms2.status = 'STATUS2'
left outer join matchstatus ms3 on m.matchid = ms3.matchid and
ms3.status = 'STATUS3'

That way you don't need to add values to the SET or add fields to the
matches table if you add values. But you would need to add another
joined instances of the matchstatus table to your query.

Regards,
Bill K.
Jul 20 '05 #2
> > My main concern is with style. This seems a somewhat messy solution
to the problem, and I can well imagine such redundancy being frowned
upon.


I try to avoid redundancy because there's a risk that the data will
become inconsistent. There's the academic issue of style or relational
normalization, but the more practical risk that is more likely to bite
you is data integrity.


Thanks, will certainly keep this in mind. Fortunately, there are very
few different sections of code that perform any updating, and I will
be using transactions, but still something to avoid if possible.
I did also consider just adding a set of fields to the match table,
each nullable, and corresponding to a given status. This however
seemed wrong as:

1) A lot of the matches will not go through a lot of the statuses,
leading to wasted storage of 4 bytes per status per record.

2) I'd rather update the ENUM / SET definitions than add additional
fields if new statuses need to be added.

3) ENUMs and SETs are sort of normalised, whereas having a field per
status clearly isn't.


Actually, I believe they're equivalent with respect to breaking
normalization. But it's only a fine point; normalization often has to
be broken to achieve some tradeoff with query simplicity or efficiency.

Here's another possible solution: you could keep the two-table design,
and create a "virtual" column-per-status-value, by doing an outer join
to the matchstatus table for each status value:

select distinct m.*, ms1.status, ms2.status, ms3.status
from matches m
left outer join matchstatus ms1 on m.matchid = ms1.matchid and
ms1.status = 'STATUS1'
left outer join matchstatus ms2 on m.matchid = ms2.matchid and
ms2.status = 'STATUS2'
left outer join matchstatus ms3 on m.matchid = ms3.matchid and
ms3.status = 'STATUS3'

That way you don't need to add values to the SET or add fields to the
matches table if you add values. But you would need to add another
joined instances of the matchstatus table to your query.


Yes, this seems like the best solution. I've actually used a similar
joining technique elsewhere in the project, and am amazed that I
didn't think to apply it here. This will certainly make for far more
readable code than present:

....
GROUP BY matchid
HAVING NOT NOT (
BIT_OR(status = 'STATUS1') AND NOT BIT_OR(status=' STATUS2')
)

I assume it'll also be potentially more efficient aswell, with MySQL
not having to faff around with aggregate functions etc.
Many thanks for the advice,

Chris Key
Jul 20 '05 #3

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

8
3001
by: vze29xhy | last post by:
I have searched the existing responses in this forum and others and could not find a solution. I have a database on MySQL server Version "4.0.13". My table structure is as follows TABLE1:MusicCD This table has three columns - UPC, Title, ExtendedTitle TABLE2: Artist This table has three columns - ArtistID, ArtistName, ArtistOtherName
3
3611
by: Lewis | last post by:
I have "My Documents" redirected to a network share but when I go to open a project it says that the share is not fully trusted. I found this info: > The simplest (least work) in this situation is to change the machine policy (those 3 options are known as policy levels). Under the Intranet zone groop, add a new code group with a Membership...
6
1598
by: swartzbill2000 | last post by:
Hello, I have some downloaded source code on my machine that the .net framework thinks is "not fully trusted". How can I fix this? I assume I use the 'Microsoft .NET Framework 1.1 Configuration' in some way. Bill
23
3599
by: Xah Lee | last post by:
The Concepts and Confusions of Pre-fix, In-fix, Post-fix and Fully Functional Notations Xah Lee, 2006-03-15 Let me summarize: The LISP notation, is a functional notation, and is not a so-called pre-fix notation or algebraic notation. Algebraic notations have the concept of operators, meaning, symbols placed around arguments. In...
0
2531
by: Rob Maui | last post by:
I use the following code in an aspx file to read the file name which I am uploading from the client PC onto the server; Dim MyFileName As String Dim MyFileColl As HttpFileCollection = Request.Files MyFileName = MyFileColl.Item(0).FileName or (alos tried) Dim PostFile As System.Web.HttpPostedFile
3
1775
by: meeraguna | last post by:
We have 20 -30 normalized tables in our dartabase . Also we have 4 tables where we store the calculated data fron those normalised tables. The Reason we have these 4 denormalised tables is when we try to do the calcultion on the fly, our site becomes very slow. So We have precalculated and stored it in 4 tables. The Process we use to do...
0
951
by: ajayindelhi | last post by:
Hi, I have two user accounts in the format: <domain>\<alias> and <fully qualified domain name>\<alias>
69
2587
by: Bill Reid | last post by:
This is how I handle a check that the last character of a text file is a newline: /* checks if newline is last character of text file */ unsigned check_text_file_newline_termination(FILE *test_file) { int end_char; fseek(test_file,-1L,SEEK_END); end_char=getc(test_file);
0
1246
by: kkshansid | last post by:
S.N. Name of NGOs Boys Girls No. of Students Passed Failed Employment Status Grant Released i have this type of data in table format for each Finical Year(2004-05 to2009-10) kindly help me to design a single normalised table
0
7903
tracyyun
by: tracyyun | last post by:
Dear forum friends, With the development of smart home technology, a variety of wireless communication protocols have appeared on the market, such as Zigbee, Z-Wave, Wi-Fi, Bluetooth, etc. Each protocol has its own unique characteristics and advantages, but as a user who is planning to build a smart home system, I am a bit confused by the...
0
6176
agi2029
by: agi2029 | last post by:
Let's talk about the concept of autonomous AI software engineers and no-code agents. These AIs are designed to manage the entire lifecycle of a software development project—planning, coding, testing, and deployment—without human intervention. Imagine an AI that can take a project description, break it down, write the code, debug it, and then...
1
5452
isladogs
by: isladogs | last post by:
The next Access Europe User Group meeting will be on Wednesday 1 May 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome a new presenter, Adolph Dupré who will be discussing some powerful techniques for using class modules. He will explain when you may want to use classes...
0
5170
by: conductexam | last post by:
I have .net C# application in which I am extracting data from word file and save it in database particularly. To store word all data as it is I am converting the whole word file firstly in HTML and then checking html paragraph one by one. At the time of converting from word file to html my equations which are in the word document file was convert...
0
3593
by: TSSRALBI | last post by:
Hello I'm a network technician in training and I need your help. I am currently learning how to create and manage the different types of VPNs and I have a question about LAN-to-LAN VPNs. The last exercise I practiced was to create a LAN-to-LAN VPN between two Pfsense firewalls, by using IPSEC protocols. I succeeded, with both firewalls in...
0
3573
by: adsilva | last post by:
A Windows Forms form does not have the event Unload, like VB6. What one acts like?
1
2034
by: 6302768590 | last post by:
Hai team i want code for transfer the data from one system to another through IP address by using C# our system has to for every 5mins then we have to update the data what the data is updated we have to send another system
1
1156
muto222
by: muto222 | last post by:
How can i add a mobile payment intergratation into php mysql website.
0
863
bsmnconsultancy
by: bsmnconsultancy | last post by:
In today's digital era, a well-designed website is crucial for businesses looking to succeed. Whether you're a small business owner or a large corporation in Toronto, having a strong online presence can significantly impact your brand's success. BSMN Consultancy, a leader in Website Development in Toronto offers valuable insights into creating...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.