
Philosophical issue - problems of distinction between code and data

Hi folks,

I'm posting this message because it's an issue I come up against relatively
often, but I can't find any writings on the subject, and I haven't been able
to figure out even what key words one would use to look for it.

First, in broad philosophical terms, code actually -is- data. Code is the
data that's fed into a compiler, interpreter, or microprocessor that tells it
what to do. Code execution is, then, just another form of data processing
handled by other code or hardware operating at the next lower level of
abstraction.

Now, in more concrete, database terms...

I recently came up with a scheme for representing how the names of various
lookup items change, merge, and split over time in a set of data imported
periodically from an Excel spreadsheet exported from another application. The
troubles are, first, that the scheme requires a large number of tables for each
lookup table to represent the mappings, and second, that there are a large
number of groups of tables that must have essentially the same structure.
Both of these are maintenance nightmares and represent a form of code
duplication.

The next idea I had brings us back to the topic of this message. I thought,
what if I use just one table for all the lookup, and a type code to say what
kind of lookup value it contains? Now, we end up using the database in a way
it was not intended. We now have a type code which represents data in one
sense but is, in another sense, part of the schema. This problem
materializes when you try to define relationships, because now, with just one
lookup table for multiple types of lookup, there's no way to define the
relations correctly.

In this case, I ended up solving the problem by having both general and
specific lookup tables. The general table contains all lookup text and IDs,
and has a type code. Each specific table is 1-to-1 with the general lookup
table, sharing the same ID, and also has the type code, but limits the allowed
value of the type code to the one code applicable to that table's type (it has
no text because that's available from the general lookup table). Now, a
compound relationship is created from the general lookup table to all the
specific lookup tables on both ID, and type code. The magic is that, now we
have specific lookup tables that we can set up proper relationships to, and a
central table containing all lookup IDs and text values that can be managed as
a single set, plus, the model enforces that both ways of identifying a record
type will agree.
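
For anyone who wants to see the shape of it, here is a minimal sketch of the general/specific arrangement. It uses Python's built-in sqlite3 module instead of Access/Jet, and the table names (lookup, region, product) and type codes are hypothetical, made up purely for illustration. The compound relationship is the two-column foreign key, and the CHECK constraint is what pins each specific table to its one type code.

```python
import sqlite3

# Hypothetical schema: one general table plus two specific tables.
SCHEMA = """
CREATE TABLE lookup (
    id        INTEGER PRIMARY KEY,
    type_code TEXT NOT NULL,
    name      TEXT NOT NULL,
    UNIQUE (id, type_code)          -- target of the compound relationship
);
CREATE TABLE region (
    id        INTEGER PRIMARY KEY,
    type_code TEXT NOT NULL DEFAULT 'REGION' CHECK (type_code = 'REGION'),
    FOREIGN KEY (id, type_code) REFERENCES lookup (id, type_code)
);
CREATE TABLE product (
    id        INTEGER PRIMARY KEY,
    type_code TEXT NOT NULL DEFAULT 'PRODUCT' CHECK (type_code = 'PRODUCT'),
    FOREIGN KEY (id, type_code) REFERENCES lookup (id, type_code)
);
"""


def open_db():
    """Create an in-memory database with the sketch schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
    conn.executescript(SCHEMA)
    return conn


def try_insert(conn, sql, params):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        with conn:  # commit on success, roll back on error
            conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False
```

With this in place, a row in region can only point at a lookup row whose type code is also REGION, so the two ways of identifying the record type cannot disagree.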

This is a specific solution to a specific problem I may never have to deal
with again, but this class of problem shows up in different disguises all the
time. An example is in VB, when you want a group of items to be part of an
array or collection, but also refer to them as named properties. VB has no
built-in mechanism for this, so you have to manually maintain the correct
mapping between the two or use some kind of code generator to ensure that the
resulting code is consistent. In .NET, this particular problem is solved
using code attributes and introspection, so obviously I'm not the only one
thinking about this class of problem.
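
To illustrate the introspection idea in a neutral language, here is a rough Python sketch (not VB or .NET). The Margins class and the helper names are hypothetical. The point is only that the list of named properties is declared once, and the "collection" view is derived from that declaration at run time instead of being maintained by hand.

```python
from dataclasses import dataclass, fields


@dataclass
class Margins:
    # Hypothetical example: four values that are both named properties
    # and members of a collection.
    top: int = 0
    right: int = 0
    bottom: int = 0
    left: int = 0


def as_items(obj):
    """List (name, value) pairs by introspecting the dataclass definition."""
    return [(f.name, getattr(obj, f.name)) for f in fields(obj)]


def scale_all(obj, factor):
    """Treat the named properties as a collection and update each in a loop."""
    for f in fields(obj):
        setattr(obj, f.name, getattr(obj, f.name) * factor)
```

Adding a fifth field to the class automatically adds it to both helpers, which is the consistency that otherwise has to be kept up manually or by a code generator.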

Does anyone know if there is a name for a topic that deals with the whole
class of issues around the ambiguity between code and data that we run into
when trying to improve the abstractions in our programs?
Nov 12 '05 #1
8 Replies


My level of expertise is definitely insufficient to follow the entire
thread of thought here, but isn't OOP exactly the approach for dealing
with recycling the code and attaching it to various data assemblies? I
have been able to use class modules for every programming need I had so
far. Indeed, if you try to use only built-in Access classes (like
Tables, which, if unchecked, eventually start to multiply like
cockroaches) you run into limitations with regard to what you can do,
but I have yet to see a problem I couldn't solve using custom class modules.

Pavel


Nov 12 '05 #2

no****@nospam.nospam (Steve Jorgensen) wrote in
<d1********************************@4ax.com>:

> Does anyone know if there is a name for a topic that deals with
> the whole class of issues around the ambiguity between code and
> data that we run into when trying to improve the abstractions in
> our programs?


I can't answer your real question, but it seems to me that you've
figured out a scenario where you are happy with the fundamental
idea behind my generic lookup table:

http://www.bway.net/~dfassoc/downloa...okupAdmin.html

I was thinking about this issue the other day and I generally think
that I'm not so concerned about enforcing referential integrity on
these kinds of lookups. Really. Indeed, I've not had a single
application where use of this generic lookup structure has resulted
in incorrect data being allowed into the lookup fields. Maybe this
is because I'm the only one writing apps against these
applications. Or maybe it's because my users never feel any need or
desire to edit tables directly because my apps give them all the
functionality they need.

In your scenario, I don't see any reason why you'd be bothered by
the structure at all. Groups of entities that can be modelled in
identical data structures are, at some level, identical entities.
There is an isomorphism that you can exploit to simplify a data
structure.

I think the reason why this is surprising or feels inappropriate is
that we too often confuse the entities in our data schema with the
real entities.

I long ago stopped using more than one table per application for
storing data about people, even though those tables are actually
holding more than one functional entity type (customer contacts,
employees, etc.). This has caused no problems at all, and has
vastly simplified any number of issues (e.g., providing search
functionality for multiple entity types).

I can't see a downside to your approach. Can you?

--
David W. Fenton http://www.bway.net/~dfenton
dfenton at bway dot net http://www.bway.net/~dfassoc
Nov 12 '05 #3

On Wed, 07 Jan 2004 20:21:23 GMT, dX********@bway.net.invalid (David W.
Fenton) wrote:

> no****@nospam.nospam (Steve Jorgensen) wrote in
> <d1********************************@4ax.com>:
>
>> Does anyone know if there is a name for a topic that deals with
>> the whole class of issues around the ambiguity between code and
>> data that we run into when trying to improve the abstractions in
>> our programs?
>
> I can't answer your real question, but it seems to me that you've
> figured out a scenario where you are happy with the fundamental
> idea behind my generic lookup table:
>
> http://www.bway.net/~dfassoc/downloa...okupAdmin.html


Yes. In a sense, I've actually worked around a large part of the limitation
of that approach, but my work-around also reintroduces the clutter you were
trying to eliminate when you originally proposed it.

> I was thinking about this issue the other day and I generally think
> that I'm not so concerned about enforcing referential integrity on
> these kinds of lookups. Really. Indeed, I've not had a single
> application where use of this generic lookup structure has resulted
> in incorrect data being allowed into the lookup fields. Maybe this
> is because I'm the only one writing apps against these
> applications. Or maybe it's because my users never feel any need or
> desire to edit tables directly because my apps give them all the
> functionality they need.

As you say on the Web page, you were dealing with small lookup tables,
unlikely to need to encompass additional functionality. I'm not sure I would
have chosen to use that solution for that case, but I wouldn't say it was a
wrong choice either. Kind of a 6 of 1, 1/2 dozen of the other sort of thing.

> In your scenario, I don't see any reason why you'd be bothered by
> the structure at all. Groups of entities that can be modelled in
> identical data structures are, at some level, identical entities.
> There is an isomorphism that you can exploit to simplify a data
> structure.

In analyzing whether to do it or not, I came up with lots of reasons to be
bothered by the structure, and my final, general/specific hybrid was a way to
mitigate those factors.

The main bothersome thing about this case is that the criterion for needing
central management was not the fact that these were mostly simple lookup
tables, but the fact that they all needed to be managed by the time-delta
management code. In fact, it turned out pretty quickly that one of the tables
involved would -not- be simply a lookup table, but a table with about 30
fields. By having the central table for the time-delta-manageable instance of
the entity, and then having an independent table representation of each entity
type, I not only have a way to use proper DRI, but a place to put additional
columns relating only to specific entity types.

In more general terms, you could say I had an OOP problem in a SQL database
schema. One entity is very different things in different contexts, and things
A, B, C, etc. are all also thing Q even though A, B, and C are all very
different from each other.

> I think the reason why this is surprising or feels inappropriate is
> that we too often confuse the entities in our data schema with the
> real entities.
>
> I long ago stopped using more than one table per application for
> storing data about people, even though those tables are actually
> holding more than one functional entity type (customer contacts,
> employees, etc.). This has caused no problems at all, and has
> vastly simplified any number of issues (e.g., providing search
> functionality for multiple entity types).

I think I agree with your answer, but not with your analysis. I say People
are really a single set, and customers, contacts, employees, etc., are roles
that people can have. To me, it seems like People are the same entity type at
both levels, and logically belong in the same set, not just the same table.
Wouldn't it be a mutable business rule whether a single Person record should
be allowed to represent both an Employee and a Customer (should attribute
changes on a single person be reflected in both places)? If so, the design
probably should not treat them as part of different sets in any sense.
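
A rough sketch of what I mean by "people are one set, roles are separate", again in Python with sqlite3 and with hypothetical table and role names. Here a person may hold both roles; if the business rule changed, that would be a constraint on person_role and not a restructuring of person.

```python
import sqlite3

# Hypothetical schema: one set of people, with roles stored as rows.
SCHEMA = """
CREATE TABLE person (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE person_role (
    person_id INTEGER NOT NULL REFERENCES person (id),
    role      TEXT NOT NULL CHECK (role IN ('EMPLOYEE', 'CUSTOMER')),
    PRIMARY KEY (person_id, role)
);
"""


def open_db():
    """Create an in-memory database with the sketch schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


def add_person(conn, name, roles):
    """Insert one person and one row per role in a single transaction."""
    with conn:
        cur = conn.execute("INSERT INTO person (name) VALUES (?)", (name,))
        for role in roles:
            conn.execute(
                "INSERT INTO person_role (person_id, role) VALUES (?, ?)",
                (cur.lastrowid, role),
            )
    return cur.lastrowid


def people_with_role(conn, role):
    """Names of everyone holding the given role, sorted by name."""
    rows = conn.execute(
        "SELECT p.name FROM person p "
        "JOIN person_role r ON r.person_id = p.id "
        "WHERE r.role = ? ORDER BY p.name",
        (role,),
    )
    return [name for (name,) in rows]
```

A name change on a person who is both an employee and a customer is made once, in person, and shows up under both roles.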

> I can't see a downside to your approach. Can you?


At least 4 downsides remain (all fairly minor):

1. Each entity's name must have the same type, maximum size, and data entry
constraints.
2. It is not possible to use the entity's name as part of a unique
constraint involving other fields contained in the specific entity table,
since the name is in the general-purpose table.
3. Everywhere in code that I add a row to any of these tables, I now have to
add the record to 2 tables, the general and the specific. Likewise, when I
delete, I need to make sure to delete the specific first, then the general
(can use cascade delete, but still need to make sure to delete from the
general table).
4. DRI ensures that I can't have a specific record without the general, but
it's still possible to have a general record without the specific, even though
that's an inconsistent state from the application point of view.
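
Downsides 3 and 4 can at least be contained in one place. The following is only a sketch under assumptions: Python with sqlite3 standing in for Access/Jet, and hypothetical names (lookup, region, add_region, and so on). It wraps the two inserts in one transaction, relies on a cascade for deletes, and adds a query that reports general rows with no specific row.

```python
import sqlite3

# Hypothetical schema: the general table and one specific table.
SCHEMA = """
CREATE TABLE lookup (
    id        INTEGER PRIMARY KEY,
    type_code TEXT NOT NULL,
    name      TEXT NOT NULL,
    UNIQUE (id, type_code)
);
CREATE TABLE region (
    id        INTEGER PRIMARY KEY,
    type_code TEXT NOT NULL DEFAULT 'REGION' CHECK (type_code = 'REGION'),
    FOREIGN KEY (id, type_code) REFERENCES lookup (id, type_code)
        ON DELETE CASCADE
);
"""


def open_db():
    """Create an in-memory database with the sketch schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # needed for FK checks and cascades
    conn.executescript(SCHEMA)
    return conn


def add_region(conn, name):
    """Insert the general row and its specific row together (downside 3)."""
    with conn:  # both inserts commit, or both roll back on error
        cur = conn.execute(
            "INSERT INTO lookup (type_code, name) VALUES ('REGION', ?)", (name,)
        )
        conn.execute("INSERT INTO region (id) VALUES (?)", (cur.lastrowid,))
    return cur.lastrowid


def delete_region(conn, region_id):
    """Delete the general row; the cascade removes the specific row."""
    with conn:
        conn.execute(
            "DELETE FROM lookup WHERE id = ? AND type_code = 'REGION'",
            (region_id,),
        )


def orphaned_regions(conn):
    """IDs of general REGION rows that lack a specific row (downside 4)."""
    rows = conn.execute(
        "SELECT l.id FROM lookup l LEFT JOIN region r ON r.id = l.id "
        "WHERE l.type_code = 'REGION' AND r.id IS NULL ORDER BY l.id"
    )
    return [row[0] for row in rows]
```

This does not make the inconsistent state impossible, since anything that writes to the general table directly can still create an orphan, but it gives one code path for normal use and a check to run after imports.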
Nov 12 '05 #4

On Wed, 07 Jan 2004 09:35:25 -0700, Pavel Romashkin
<pa*************@hotmail.com> wrote:

> My level of expertise is definitely insufficient to follow the entire
> thread of thought here, but isn't OOP exactly the approach for dealing
> with recycling the code and attaching it to various data assemblies? I
> have been able to use class modules for every programming need I had so
> far. Indeed, if you try to use only built-in Access classes (like
> Tables, which, if unchecked, eventually start to multiply like
> cockroaches) you run into limitations with regard to what you can do,
> but I have yet to see a problem I couldn't solve using custom class modules.


After my reply to David, I realize that you are exactly right. OOP does solve
a large number of problems in this category, though clearly not all of them.
In this case, I was actually solving a polymorphism problem, but had to do it
in the context of a relational database schema where OOP constructs are not
available.
Nov 12 '05 #5

On Wed, 07 Jan 2004 23:02:40 GMT, Steve Jorgensen
<no****@nospam.nospam> wrote in comp.databases.ms-access:

> After my reply to David, I realize that you are exactly right. OOP does solve
> a large number of problems in this category, though clearly not all of them.
> In this case, I was actually solving a polymorphism problem, but had to do it
> in the context of a relational database schema where OOP constructs are not
> available.


Steve, perhaps it's time for you to look into object-relational
database systems...

Peter Miller
__________________________________________________ __________
PK Solutions -- Data Recovery for Microsoft Access/Jet/SQL
Free quotes, Guaranteed lowest prices and best results
www.pksolutions.com 1.866.FILE.FIX 1.760.476.9051
Nov 12 '05 #6

Thanks for the update on medical marijuana use!
Nov 12 '05 #7

On Thu, 08 Jan 2004 00:31:51 GMT, Peter Miller <pm*****@pksolutions.com>
wrote:

> On Wed, 07 Jan 2004 23:02:40 GMT, Steve Jorgensen
> <no****@nospam.nospam> wrote in comp.databases.ms-access:
>
>> After my reply to David, I realize that you are exactly right. OOP does solve
>> a large number of problems in this category, though clearly not all of them.
>> In this case, I was actually solving a polymorphism problem, but had to do it
>> in the context of a relational database schema where OOP constructs are not
>> available.
>
> Steve, perhaps it's time for you to look into object-relational
> database systems...


Yeah, I'm starting to realize that, but in this case, there was exactly one
type of polymorphism problem and I found a solution. My post here was really
a question to see if anyone knew of resources or search terms about the
broader topic of data vs. code representation of rules and structure in
applications, and ways to handle cases where some aspect of the app is better
handled as code or schema structure for some purposes, and as data for
other purposes.

Now that I look at it, though, I think I got it. Most of what I'm looking at
is really in the category of polymorphism, and OO is the only model we really
have for dealing with that.
Nov 12 '05 #8

"XMVP" <ac***********@hotmail.com> wrote

> Thanks for the update on medical marijuana use!


Just as expected from an access_moron. Better go spend some time at
http://www.donny.com and get your head straight.

XDPM
Nov 12 '05 #9
