Microsoft Access, Double Data Entry and breaking the Normalization rule

hippomedon

Hello everyone,

I'm looking for some advice on whether I should break the
normalization rule. Normally, I would not consider it, but this seems
to be a special case.

I have created an "Outcomes Database" used to store response data
from measures/ questionnaires for a longitudinal health study. It is
essentially derived from Duane Hookom's Survey Database (thanks
Duane!!!), with many modifications added to fit the needs of my lab.
One of the additions, is the ability to do "Double Data Entry" in
order to verify the data (as required by the grant).

All the data is entered into the "tbl_DateE", with the following 5
Key fields: [DtaPartID] (Participant ID), [DtaQstnE] (evaluation
number), [DtaSrvID] (Survey ID), [DtaEntry] (1st versus 2nd entry),
[DtaQstnQ] (Question Number). There is one other field [DtaRspns] that
contains the participant actual response.

I've written VBA code that compares the first entry to the second
entry, outputting discrepancies to a table. These errors can be easily
printed in order to hand-check against the hard-copy. However, none of
the ways I can think of to do automatic identification of errors as
the data is being 2nd entered make me very happy. The best might be
using a DAO.recordset and the FindFirst command to compare the 2nd to
the 1st. If there is a discrepancy this code would update a "flag"
field. Then these errors could be filtered and corrected.

On the other hand, it seems that it might be easier to "de-Normalize"
the table slightly to have a [DtaRspn1] and a [DtaRspn2] field. These
fields could be compared easier and possibly faster than the FindFirst
method suggested above.

Any thoughts you might have would be much appreciated. I'm hesitant
to break the normalization rule, as in the past this has always come
back to haunt me. . . .

Thanks,
Paul

May 29 '07 #1

Subscribe Post Reply

6868

absolutejunkforjeremy

On May 29, 3:34 pm, "hippome...@googlemail.com"
<hippome...@googlemail.comwrote:

Hello everyone,

I'm looking for some advice on whether I should break the
normalization rule. Normally, I would not consider it, but this seems
to be a special case.

I have created an "Outcomes Database" used to store response data
from measures/ questionnaires for a longitudinal health study. It is
essentially derived from Duane Hookom's Survey Database (thanks
Duane!!!), with many modifications added to fit the needs of my lab.
One of the additions, is the ability to do "Double Data Entry" in
order to verify the data (as required by the grant).

All the data is entered into the "tbl_DateE", with the following 5
Key fields: [DtaPartID] (Participant ID), [DtaQstnE] (evaluation
number), [DtaSrvID] (Survey ID), [DtaEntry] (1st versus 2nd entry),
[DtaQstnQ] (Question Number). There is one other field [DtaRspns] that
contains the participant actual response.

I've written VBA code that compares the first entry to the second
entry, outputting discrepancies to a table. These errors can be easily
printed in order to hand-check against the hard-copy. However, none of
the ways I can think of to do automatic identification of errors as
the data is being 2nd entered make me very happy. The best might be
using a DAO.recordset and the FindFirst command to compare the 2nd to
the 1st. If there is a discrepancy this code would update a "flag"
field. Then these errors could be filtered and corrected.

On the other hand, it seems that it might be easier to "de-Normalize"
the table slightly to have a [DtaRspn1] and a [DtaRspn2] field. These
fields could be compared easier and possibly faster than the FindFirst
method suggested above.

Any thoughts you might have would be much appreciated. I'm hesitant
to break the normalization rule, as in the past this has always come
back to haunt me. . . .

Thanks,
Paul

Paul,

Sounds like a good time to denormalize. In my eyes, these are separate
data points that are part of the same real-world object.

Normalization is really important, and it's good that your tendency is
to normalize, but there are many cases where strict normalization
would lead to a poor representation of what's in the real world, and
your job is to make your database reflect the real world, not to make
your database match up to a set of rules. The rules only exist because
they are usually helpful in doing that modeling. In cases such as this
it's good to question normalization.

JeremyNYC

May 29 '07 #2

hippomedon

On May 29, 3:45 pm, "absolutejunkforjer...@gmail.com"
<absolutejunkforjer...@gmail.comwrote:

On May 29, 3:34 pm, "hippome...@googlemail.com"

<hippome...@googlemail.comwrote:
Hello everyone,

I'm looking for some advice on whether I should break the
normalization rule. Normally, I would not consider it, but this seems
to be a special case.

I have created an "Outcomes Database" used to store response data
from measures/ questionnaires for a longitudinal health study. It is
essentially derived from Duane Hookom's Survey Database (thanks
Duane!!!), with many modifications added to fit the needs of my lab.
One of the additions, is the ability to do "Double Data Entry" in
order to verify the data (as required by the grant).

All the data is entered into the "tbl_DateE", with the following 5
Key fields: [DtaPartID] (Participant ID), [DtaQstnE] (evaluation
number), [DtaSrvID] (Survey ID), [DtaEntry] (1st versus 2nd entry),
[DtaQstnQ] (Question Number). There is one other field [DtaRspns] that
contains the participant actual response.

I've written VBA code that compares the first entry to the second
entry, outputting discrepancies to a table. These errors can be easily
printed in order to hand-check against the hard-copy. However, none of
the ways I can think of to do automatic identification of errors as
the data is being 2nd entered make me very happy. The best might be
using a DAO.recordset and the FindFirst command to compare the 2nd to
the 1st. If there is a discrepancy this code would update a "flag"
field. Then these errors could be filtered and corrected.

On the other hand, it seems that it might be easier to "de-Normalize"
the table slightly to have a [DtaRspn1] and a [DtaRspn2] field. These
fields could be compared easier and possibly faster than the FindFirst
method suggested above.

Any thoughts you might have would be much appreciated. I'm hesitant
to break the normalization rule, as in the past this has always come
back to haunt me. . . .

Thanks,
Paul

Paul,

Sounds like a good time to denormalize. In my eyes, these are separate
data points that are part of the same real-world object.

Normalization is really important, and it's good that your tendency is
to normalize, but there are many cases where strict normalization
would lead to a poor representation of what's in the real world, and
your job is to make your database reflect the real world, not to make
your database match up to a set of rules. The rules only exist because
they are usually helpful in doing that modeling. In cases such as this
it's good to question normalization.

JeremyNYC

Thanks Jeremy. I'm definitely leaning towards de-normalization in this
case.

May 29 '07 #3

Tony Toews [MVP]

"hi********@googlemail.com" <hi********@googlemail.comwrote:

On the other hand, it seems that it might be easier to "de-Normalize"
the table slightly to have a [DtaRspn1] and a [DtaRspn2] field.

This is not denormalizing. You have the same data but entered twice. It's
perfectly legitimate to do this.

At the concept or overview lefel this is similar to storing the cost and price of an
item when you sell the item. After all the cost and price could change moments
after the transaction is entered.

Tony
--
Tony Toews, Microsoft Access MVP
Please respond only in the newsgroups so that others can
read the entire thread of messages.
Microsoft Access Links, Hints, Tips & Accounting Systems at
http://www.granite.ab.ca/accsmstr.htm
Tony's Microsoft Access Blog - http://msmvps.com/blogs/access/

May 29 '07 #4

hippomedon

On May 29, 4:23 pm, "Tony Toews [MVP]" <tto...@telusplanet.netwrote:

"hippome...@googlemail.com" <hippome...@googlemail.comwrote:
On the other hand, it seems that it might be easier to "de-Normalize"
the table slightly to have a [DtaRspn1] and a [DtaRspn2] field.

This is not denormalizing. You have the same data but entered twice. It's
perfectly legitimate to do this.

At the concept or overview lefel this is similar to storing the cost and price of an
item when you sell the item. After all the cost and price could change moments
after the transaction is entered.

Tony
--
Tony Toews, Microsoft Access MVP
Please respond only in the newsgroups so that others can
read the entire thread of messages.
Microsoft Access Links, Hints, Tips & Accounting Systems athttp://www.granite.ab.ca/accsmstr.htm
Tony's Microsoft Access Blog -http://msmvps.com/blogs/access/

Thanks for the clarification Tony. I'm going ahead with it. So far
it's working great!

May 29 '07 #5

David W. Fenton

"hi********@googlemail.com" <hi********@googlemail.comwrote in
news:11**********************@q69g2000hsb.googlegr oups.com:

I have created an "Outcomes Database" used to store response data
from measures/ questionnaires for a longitudinal health study. It
is essentially derived from Duane Hookom's Survey Database (thanks
Duane!!!), with many modifications added to fit the needs of my
lab. One of the additions, is the ability to do "Double Data
Entry" in order to verify the data (as required by the grant).

Do you have to store the results of the double entry? If not, then
just load the existing record as a recordset and use an unbound form
to check the 2nd entry in the unbound form against the stored entry.
You could even use a form bound to a single record and just not bind
the controls.

Obviously, if you have to *store* the 2nd entry, not applicable,
though if you want to check at data entry time, this would work even
with a bound form for the new data (using an in-memory recordset
with the first version of the data).

If you want to check for discrepancies in a batch, I recommend doing
it by generating a series of SQL statements, one field for each,
that checks the values in the two tables against each other, and if
something is found, writes to a table what the discrepancy is.

Both of these ideas are something I've already implemented in other
apps.

--
David W. Fenton http://www.dfenton.com/
usenet at dfenton dot com http://www.dfenton.com/DFA/

May 30 '07 #6

David W. Fenton

"ab*******************@gmail.com" <ab*******************@gmail.com>
wrote in news:11*********************@g4g2000hsf.googlegrou ps.com:

Sounds like a good time to denormalize.

If both versions of the data need to be stored, I see no reason why
it should be considered denormalization.

I don't think I'd consider doing both versions in one record,
though. I think it's better to either have two tables with the same
structure, or use a "narrow" table for the second entry, where you
record in each field the value entered and which field it is, and
which main record it applies to. Or, if the 2nd entry doesn't have
to be recorded, you could record in this table only the data entry
discrepancies.

--
David W. Fenton http://www.dfenton.com/
usenet at dfenton dot com http://www.dfenton.com/DFA/

May 30 '07 #7

David W. Fenton

"Tony Toews [MVP]" <tt****@telusplanet.netwrote in
news:up********************************@4ax.com:

"hi********@googlemail.com" <hi********@googlemail.comwrote:

>On the other hand, it seems that it might be easier to
"de-Normalize"
the table slightly to have a [DtaRspn1] and a [DtaRspn2] field.

This is not denormalizing. You have the same data but entered
twice. It's perfectly legitimate to do this.

At the concept or overview lefel this is similar to storing the
cost and price of an item when you sell the item. After all the
cost and price could change moments after the transaction is
entered.

Yes, but there are different ways to store the data:

1. double the fields in a single record.

2. have two records in a single table, with identical structure.

3. have two separate tables with identical structure.

4. have a main table and then a narrow side table that records only
the discrepancies, one field per record.

The worst of all seems to me to be the 1st choice, which is, I
believe, what the OP is considering.

--
David W. Fenton http://www.dfenton.com/
usenet at dfenton dot com http://www.dfenton.com/DFA/

May 30 '07 #8

hippomedon

On May 29, 8:42 pm, "David W. Fenton" <XXXuse...@dfenton.com.invalid>
wrote:

"Tony Toews [MVP]" <tto...@telusplanet.netwrote innews:up********************************@4ax.com:

"hippome...@googlemail.com" <hippome...@googlemail.comwrote:

On the other hand, it seems that it might be easier to
"de-Normalize"
the table slightly to have a [DtaRspn1] and a [DtaRspn2] field.

This is not denormalizing. You have the same data but entered
twice. It's perfectly legitimate to do this.

At the concept or overview lefel this is similar to storing the
cost and price of an item when you sell the item. After all the
cost and price could change moments after the transaction is
entered.

Yes, but there are different ways to store the data:

1. double the fields in a single record.

2. have two records in a single table, with identical structure.

3. have two separate tables with identical structure.

4. have a main table and then a narrow side table that records only
the discrepancies, one field per record.

The worst of all seems to me to be the 1st choice, which is, I
believe, what the OP is considering.

--
David W. Fenton http://www.dfenton.com/
usenet at dfenton dot com http://www.dfenton.com/DFA/

Thanks David. Unfortunately, I have to store the 2nd entry so the
unbound form method would not work.

Currently the db is set-up with option 2 (two records in same table).
This works very well for validation purposes, when the validation is
done after the entry. The VBA code that I developed easily compares
Entry 1 to Entry 2 and generates an output. However, for instantaneous
validation, it does not work as well. This is the main reason I'm
considering switching to option 1. I think it would allow for an
easier and more instantaneous comparison of entry 2 to entry 1. It
should actually only add 2 fields to each record (one for the
response, and one to record the time of entry).

Why do you think that this option is the worst? The only potential
disadvantage that I can foresee is that if a "record" is deleted
accidentally, it would be easy to miss. However, the database is well
secured from the end-users, and I have queries that check against
missing records prior to exporting to SPSS (since every participant is
supposed to have the same number of questions).

Your further thoughts would be appreciated.

Thanks,
Paul

PS. I've used two tables in the past, but found that having one table
generally worked better (validations were faster, querying easier,
etc).

May 30 '07 #9

David W. Fenton

"hi********@googlemail.com" <hi********@googlemail.comwrote in
news:11**********************@o5g2000hsb.googlegro ups.com:

Currently the db is set-up with option 2 (two records in same
table).
This works very well for validation purposes, when the validation
is done after the entry. The VBA code that I developed easily
compares Entry 1 to Entry 2 and generates an output. However, for
instantaneous validation, it does not work as well. This is the
main reason I'm considering switching to option 1. I think it
would allow for an easier and more instantaneous comparison of
entry 2 to entry 1. It should actually only add 2 fields to each
record (one for the response, and one to record the time of
entry).

I don't see why you can't load the other record in a recordset and
compare the values field by field when they are entered into the
other table. That would be just as easy as comparing to a different
field. You'd do it something like this in the BeforeUpdate of the
control:

If Nz(Me!ControlName) <Nz(rs(Me!ControlName.ControlSource)) Then
...

You could also write a function that uses Screen.ActiveControl to do
it, which makes it easy to assign to all the controls (in the above,
replace Me!ControlName with Screen.ActiveControl).

--
David W. Fenton http://www.dfenton.com/
usenet at dfenton dot com http://www.dfenton.com/DFA/

May 30 '07 #10

DavidB

All the data is entered into the "tbl_DateE", with the following 5

Key fields: [DtaPartID] (Participant ID), [DtaQstnE] (evaluation
number), [DtaSrvID] (Survey ID), [DtaEntry] (1st versus 2nd entry),
[DtaQstnQ] (Question Number). There is one other field [DtaRspns] that
contains the participant actual response.

I have an aside question here. Noting the table structure above, why
is it that people abbreviate field name to the point of
nonrecognition? Take for example the field DtaSrvID in this table.
Why not name it DataSurveyID which is (I ass-u-me) the data element it
contains? It will make going back into the code at a lated point MUCH
easier. I code and support a large number of databases. Some of them
written by other folks who use these abbreviated naming conventions.
There are many times it takes me a lot of time to figure out what the
data elements even are because of this. The fact that the person
asking the original question had to add an explanation of each field
next to the field name kind of makes my point. Is there any actual
advantage to using the abbreviated field names or is it just a throw
back to the early days of geekdom where space was limited akin to the
2 digit year? (Paul, I am not kicking your cat by asking this. Just
something that has been bugging me for a while and figured I would
ask.)

May 30 '07 #11

hippomedon

On May 30, 2:02 pm, DavidB <j...@yahoo.comwrote:

All the data is entered into the "tbl_DateE", with the following 5
Key fields: [DtaPartID] (Participant ID), [DtaQstnE] (evaluation
number), [DtaSrvID] (Survey ID), [DtaEntry] (1st versus 2nd entry),
[DtaQstnQ] (Question Number). There is one other field [DtaRspns] that
contains the participant actual response.

I have an aside question here. Noting the table structure above, why
is it that people abbreviate field name to the point of
nonrecognition? Take for example the field DtaSrvID in this table.
Why not name it DataSurveyID which is (I ass-u-me) the data element it
contains? It will make going back into the code at a lated point MUCH
easier. I code and support a large number of databases. Some of them
written by other folks who use these abbreviated naming conventions.
There are many times it takes me a lot of time to figure out what the
data elements even are because of this. The fact that the person
asking the original question had to add an explanation of each field
next to the field name kind of makes my point. Is there any actual
advantage to using the abbreviated field names or is it just a throw
back to the early days of geekdom where space was limited akin to the
2 digit year? (Paul, I am not kicking your cat by asking this. Just
something that has been bugging me for a while and figured I would
ask.)

Hi DavidB,
Interesting question. For me, it's always finding a balance between
having a variable name that is short and easy to type (for queries and
VBA code) and having one that is too short to be recognizable. With
something like DtaSrvID, I figure that by capitalizing every segment,
it's generally pretty recognizable. (I added clarification, just in
case). I also like have the first part of the variable refer to the
table, thus Dta references the fact that this variable comes from the
Data table.
Paul

May 30 '07 #12

hippomedon

On May 30, 1:44 pm, "David W. Fenton" <XXXuse...@dfenton.com.invalid>
wrote:

"hippome...@googlemail.com" <hippome...@googlemail.comwrote innews:11**********************@o5g2000hsb.googleg roups.com:

Currently the db is set-up with option 2 (two records in same
table).
This works very well for validation purposes, when the validation
is done after the entry. The VBA code that I developed easily
compares Entry 1 to Entry 2 and generates an output. However, for
instantaneous validation, it does not work as well. This is the
main reason I'm considering switching to option 1. I think it
would allow for an easier and more instantaneous comparison of
entry 2 to entry 1. It should actually only add 2 fields to each
record (one for the response, and one to record the time of
entry).

I don't see why you can't load the other record in a recordset and
compare the values field by field when they are entered into the
other table. That would be just as easy as comparing to a different
field. You'd do it something like this in the BeforeUpdate of the
control:

If Nz(Me!ControlName) <Nz(rs(Me!ControlName.ControlSource)) Then
...

You could also write a function that uses Screen.ActiveControl to do
it, which makes it easy to assign to all the controls (in the above,
replace Me!ControlName with Screen.ActiveControl).

--
David W. Fenton http://www.dfenton.com/
usenet at dfenton dot com http://www.dfenton.com/DFA/

This certainly works, although I find that it's slightly slower than
comparing to another field. I'm curious about your concerns for having
two fields, instead of two tables.

My big problem with this method is making it aesthetically pleasing
and/or easy to use for the end user. I'll try to explain the process a
little better to show what I mean. Typically, it's fastest to enter
the 500 or so responses at each evaluation in one shot. The validation
is happening either silently in the background or done immediately
after the data has been 2nd entered (by pressing a command button, or
whatever). Then, the person doing 2nd entry is supposed to fix the
errors. Since the errors can happen in either first entry or second,
it seems easiest and most intuitive to have a 3rd "Validation" tab on
the main entry form with a subform filtered to shows only the
discrepancies. (Tabs 1 and 2 are for first and 2nd entry,
respectively). Entry 1 would be right next Entry 2, the person doing
data entry checks the hard copy and fixes whichever entry has the
mistake.

With 1 record storing both entries, the subform for the "validation
tab" is very easy to design. It's very easy to query for the
discrepancies, and easy to have an updateable dataset. With two
tables and/or two records, the "Validation tab" seems to require 2
subforms (one for each entry -- so the two entries line up
horizontally next to each other). Additionally, filtering/ querying
for the discrepancies is harder (although that may just be me).

Please let me know your concerns with this table structure! I think
the intuitively easy-to-use and slightly faster end product, justifies
this structure. I was a bit wary of increasing the number of variables
in the table, but there seems to be no real reason or disadvantage to
this!

Thanks,
Paul

May 30 '07 #13

hippomedon

On May 30, 1:44 pm, "David W. Fenton" <XXXuse...@dfenton.com.invalid>
wrote:

"hippome...@googlemail.com" <hippome...@googlemail.comwrote innews:11**********************@o5g2000hsb.googleg roups.com:

Currently the db is set-up with option 2 (two records in same
table).
This works very well for validation purposes, when the validation
is done after the entry. The VBA code that I developed easily
compares Entry 1 to Entry 2 and generates an output. However, for
instantaneous validation, it does not work as well. This is the
main reason I'm considering switching to option 1. I think it
would allow for an easier and more instantaneous comparison of
entry 2 to entry 1. It should actually only add 2 fields to each
record (one for the response, and one to record the time of
entry).

I don't see why you can't load the other record in a recordset and
compare the values field by field when they are entered into the
other table. That would be just as easy as comparing to a different
field. You'd do it something like this in the BeforeUpdate of the
control:

If Nz(Me!ControlName) <Nz(rs(Me!ControlName.ControlSource)) Then
...

You could also write a function that uses Screen.ActiveControl to do
it, which makes it easy to assign to all the controls (in the above,
replace Me!ControlName with Screen.ActiveControl).

--
David W. Fenton http://www.dfenton.com/
usenet at dfenton dot com http://www.dfenton.com/DFA/

May 30 '07 #14

David W. Fenton

"hi********@googlemail.com" <hi********@googlemail.comwrote in
news:11*********************@k79g2000hse.googlegro ups.com:

On May 30, 1:44 pm, "David W. Fenton"
<XXXuse...@dfenton.com.invalidwrote:
>"hippome...@googlemail.com" <hippome...@googlemail.comwrote
innews:11**********************@o5g2000hsb.google groups.com:

Currently the db is set-up with option 2 (two records in same
table).
This works very well for validation purposes, when the
validation is done after the entry. The VBA code that I
developed easily compares Entry 1 to Entry 2 and generates an
output. However, for instantaneous validation, it does not work
as well. This is the main reason I'm considering switching to
option 1. I think it would allow for an easier and more
instantaneous comparison of entry 2 to entry 1. It should
actually only add 2 fields to each record (one for the
response, and one to record the time of entry).

I don't see why you can't load the other record in a recordset
and compare the values field by field when they are entered into
the other table. That would be just as easy as comparing to a
different field. You'd do it something like this in the
BeforeUpdate of the control:

If Nz(Me!ControlName) <Nz(rs(Me!ControlName.ControlSource))
Then
...

You could also write a function that uses Screen.ActiveControl to
do it, which makes it easy to assign to all the controls (in the
above, replace Me!ControlName with Screen.ActiveControl).

--
David W. Fenton http://www.dfenton.com/
usenet at dfenton dot com http://www.dfenton.com/DFA/

This certainly works, although I find that it's slightly slower
than comparing to another field.

Why? The only difference would be the time it takes to open the
recordset on the existing record, and that's going to be very small
(unless you do it for each field, which makes no sense -- instead
you'd do it once for the whole record, as soon as it's been
determined what the matching record is).

I'm curious about your concerns for having
two fields, instead of two tables.

Or two records in the same table.

It's more normalized and I see no advantage whatsoever in using two
fields in the same record. I like the elegance of using the same
field name to compare, as opposed to lopping off the last character
and using a new last character.

My big problem with this method is making it aesthetically
pleasing
and/or easy to use for the end user. I'll try to explain the
process a little better to show what I mean. Typically, it's
fastest to enter the 500 or so responses at each evaluation in one
shot. The validation is happening either silently in the
background or done immediately after the data has been 2nd entered
(by pressing a command button, or whatever). Then, the person
doing 2nd entry is supposed to fix the errors. Since the errors
can happen in either first entry or second, it seems easiest and
most intuitive to have a 3rd "Validation" tab on the main entry
form with a subform filtered to shows only the discrepancies.
(Tabs 1 and 2 are for first and 2nd entry, respectively). Entry 1
would be right next Entry 2, the person doing data entry checks
the hard copy and fixes whichever entry has the mistake.

With 1 record storing both entries, the subform for the
"validation
tab" is very easy to design. It's very easy to query for the
discrepancies, and easy to have an updateable dataset. With two
tables and/or two records, the "Validation tab" seems to require 2
subforms (one for each entry -- so the two entries line up
horizontally next to each other). Additionally, filtering/
querying for the discrepancies is harder (although that may just
be me).

I don't see how it's harder. You can get them into a single row with
nothing more than joining the two tables togetner and then referring
to the field names fully specified with table name.

Please let me know your concerns with this table structure! I
think the intuitively easy-to-use and slightly faster end product,
justifies this structure. I was a bit wary of increasing the
number of variables in the table, but there seems to be no real
reason or disadvantage to this!

Well, that raises the question of whether it's correct to structure
a survey with many fields in one record, or if each survey question
should be its own record. I definitely lean towards the latter, as
it makes a whole host of things much, much easier.

--
David W. Fenton http://www.dfenton.com/
usenet at dfenton dot com http://www.dfenton.com/DFA/

May 31 '07 #15

David W. Fenton

"hi********@googlemail.com" <hi********@googlemail.comwrote in
news:11**********************@m36g2000hse.googlegr oups.com:

Interesting question. For me, it's always finding a balance
between
having a variable name that is short and easy to type (for queries
and VBA code) and having one that is too short to be recognizable.

Intellisense removes the need for most of the typing.

--
David W. Fenton http://www.dfenton.com/
usenet at dfenton dot com http://www.dfenton.com/DFA/

May 31 '07 #16

hippomedon

Intellisense removes the need for most of the typing.

But not all :-)

Paul

Jun 1 '07 #17

hippomedon

Why? The only difference would be the time it takes to open the
recordset on the existing record, and that's going to be very small
(unless you do it for each field, which makes no sense -- instead
you'd do it once for the whole record, as soon as it's been
determined what the matching record is).

I tried this, but it's slightly slower. Perhaps, it's because the
backend is on a network drive and because the tables can have many
hundreds of thousands of questions. I've found that the network drive
in particular slows down opening recordsets. However, with a slightly
broader defined recordset opened once for a participat's entire
evaluation (thus about 500 records -- 1 per question) and the
FindFirst method, the speed difference disappears.

It's more normalized and I see no advantage whatsoever in using two
fields in the same record. I like the elegance of using the same
field name to compare, as opposed to lopping off the last character
and using a new last character.

It is a bit more normalized and elegant. However, I find it a little
more awkward to use. . . Probably just me.

I don't see how it's harder. You can get them into a single row with
nothing more than joining the two tables togetner and then referring
to the field names fully specified with table name.

Joins do well well with two tables. Sorry, I was thinking of two
records in one table.

Well, that raises the question of whether it's correct to structure
a survey with many fields in one record, or if each survey question
should be its own record. I definitely lean towards the latter, as
it makes a whole host of things much, much easier.

Really? I don't believe that should even be a question! Normalizing to
the question level does make everything much, much easier. . . To
anyone reading this post for advice, please, please follow David's
advice and normalize any survey database to the question level (Duane
Hookom, MVP, provides a nice model as a jump point for those starting
off). I have heard some statisticians complain about this, because
they need the data in multivariate form for analysis, but crosstab
queries should always be able to allay their fears.

The issue at hand is whether it's okay to have a second variable for
the second entry in the table that is already normalized to the
question level. Of the three main options summarized above, I have
found that having two records in the same table is the most cumbersome
for validation. It's only advantage might come in if you need to enter
the exact same data more than twice (although I've never heard of such
a thing). Having two tables, one for first entry and one for second,
worked very well, but was slightly more awkward (for me). It is,
however, a little more elegant and normalized. I've found that having
two "response" variables, one for first entry and one for second entry
has worked the best for me (in terms of ease of set-up). Since I only
use the 2nd entry for validation, I doubt that this slightly less
normalized structure will pose any problems:)

Thanks a lot David for your advice and suggestions. They helped prompt
some interesting tests, some of which have proven very useful (e.g.,
despite working intensively with VBA for over a year, for some reason
Screen.ActiveControl was not mentioned in my books. . . Very, very
useful tip).

Paul

Jun 1 '07 #18

David W. Fenton

"hi********@googlemail.com" <hi********@googlemail.comwrote in
news:11**********************@k79g2000hse.googlegr oups.com:

>Intellisense removes the need for most of the typing.

But not all :-)

The only situation I can think of where you don't get some form of
autocompletion is in the QBE when you've aliased a field.

--
David W. Fenton http://www.dfenton.com/
usenet at dfenton dot com http://www.dfenton.com/DFA/

Jun 1 '07 #19

David W. Fenton

"hi********@googlemail.com" <hi********@googlemail.comwrote in
news:11**********************@m36g2000hse.googlegr oups.com:

[quoting me:]

>Why? The only difference would be the time it takes to open the
recordset on the existing record, and that's going to be very
small (unless you do it for each field, which makes no sense --
instead you'd do it once for the whole record, as soon as it's
been determined what the matching record is).

I tried this,

Tried what? How did you do the validation? Field by field, or in a
batch, going through all the fields at once?

but it's slightly slower. Perhaps, it's because the
backend is on a network drive and because the tables can have many
hundreds of thousands of questions.

This shouldn't cause any noticeable difference if you've properly
indexed your fields.

I've found that the network drive
in particular slows down opening recordsets.

I haven't seen this as any particular problem, except for recordsets
with complex joins.

However, with a slightly
broader defined recordset opened once for a participat's entire
evaluation (thus about 500 records -- 1 per question) and the
FindFirst method, the speed difference disappears.

OK. As I suspected, the issue was your *method* for finding the
information.

[]

>I don't see how it's harder. You can get them into a single row
with nothing more than joining the two tables togetner and then
referring to the field names fully specified with table name.

Joins do well well with two tables. Sorry, I was thinking of two
records in one table.

That should work fine in one table, too. Why would it not?

>Well, that raises the question of whether it's correct to
structure a survey with many fields in one record, or if each
survey question should be its own record. I definitely lean
towards the latter, as it makes a whole host of things much, much
easier.

Really? I don't believe that should even be a question!
Normalizing to the question level does make everything much, much
easier. . . To anyone reading this post for advice, please, please
follow David's advice and normalize any survey database to the
question level (Duane Hookom, MVP, provides a nice model as a jump
point for those starting off). I have heard some statisticians
complain about this, because they need the data in multivariate
form for analysis, but crosstab queries should always be able to
allay their fears.

I didn't realize until your comment above about 500 questions that
you were normalized to the question level.

The issue at hand is whether it's okay to have a second variable
for the second entry in the table that is already normalized to
the question level. Of the three main options summarized above, I
have found that having two records in the same table is the most
cumbersome for validation.

Well, I'm less opposed to it now that you say that you are
normalized to the question level. Dunno if you've got a single text
field that you coerce to the correct data type for evaluation or if
you've got different fields for each data type (a text field, a
Boolean field, a Double). If the latter, then it would bother me to
double the number of fields. If the former, it wouldn't bother me
too much at all.

It's only advantage might come in if you need to enter
the exact same data more than twice (although I've never heard of
such a thing). Having two tables, one for first entry and one for
second, worked very well, but was slightly more awkward (for me).
It is, however, a little more elegant and normalized. I've found
that having two "response" variables, one for first entry and one
for second entry has worked the best for me (in terms of ease of
set-up). Since I only use the 2nd entry for validation, I doubt
that this slightly less normalized structure will pose any
problems:)

Thanks a lot David for your advice and suggestions. They helped
prompt some interesting tests, some of which have proven very
useful (e.g., despite working intensively with VBA for over a
year, for some reason Screen.ActiveControl was not mentioned in my
books. . . Very, very useful tip).

I'm still discovering new things and I've been programming Access
for a living since 1996.

--
David W. Fenton http://www.dfenton.com/
usenet at dfenton dot com http://www.dfenton.com/DFA/

Jun 1 '07 #20

hippomedon

Thanks David. It sounds like we are finally on the same page:-) I have
a single text field that I coerce into the appropriate data type when
I export to SAS/ SPSS. Thus, as said above, I really am only adding a
single field to the table.

Jun 4 '07 #21

Microsoft Access, Double Data Entry and breaking the Normalization rule

Similar topics