good table design...

toedipper

Hello,

I am designing a table of vehicle types, nothing special, just a list of
unique vehicle types such as truck, lorry, bike, motor bike, plane, tractor
etc etc

For the table design I am proposing a single column table with a field name
called vehicle_type and this will contain the vehicle type.

So it will be

vehicle_type
car
bike
tractor
plane
truck
van
blah
blah
blah

Is this ok? Or is there a better way to do it?

Thanks,

td.

Jul 20 '05 #1

--CELKO--

>> For the table design I am proposing a single column table with a
field [sic] name called vehicle_type and this will contain the vehicle
type... Is this ok? <<

Sure, if you remember to make the one column the primary key. It is
weird just hanging out there in space without anything in the schema,
but it is legal. And fields are not anything like columns.

Jul 20 '05 #2

toedipper

Thanks for your reply.

Totally off topic but what does 'sic' mean? I see this all over the place
both on the web and in print.

Thanks,

TD.
"--CELKO--" <jc*******@earthlink.net> wrote in message
news:18**************************@posting.google.com...

> >> For the table design I am proposing a single column table with a
> field [sic] name called vehicle_type and this will contain the vehicle
> type... Is this ok? <<
>
> Sure, if you remember to make the one column the primary key. It is
> weird just hanging out there in space without anything in the schema,
> but it is legal. And fields are not anything like columns.

Jul 20 '05 #3

David Portas

Best way to describe a table is as DDL:

CREATE TABLE Vehicles (vehicle_type VARCHAR(20) PRIMARY KEY)

Nothing wrong with this on its own but you need to consider its place in the
schema as a whole rather than in isolation. Vehicle_type may not necessarily
be appropriate or convenient as a foreign key in other tables.
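
A minimal sketch of what the PRIMARY KEY buys you in that one-column table. It uses Python's built-in sqlite3 module with an in-memory database purely as a runnable stand-in; SQL Server syntax and error reporting will differ, and the sample rows are only illustrative.

```python
import sqlite3

# In-memory SQLite database, used here only as a stand-in for SQL Server.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Vehicles (vehicle_type VARCHAR(20) NOT NULL PRIMARY KEY)"
)
conn.executemany(
    "INSERT INTO Vehicles (vehicle_type) VALUES (?)",
    [("car",), ("bike",), ("tractor",)],
)

# A second 'car' violates the primary key, so the insert is rejected.
try:
    conn.execute("INSERT INTO Vehicles (vehicle_type) VALUES ('car')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

vehicle_count = conn.execute("SELECT COUNT(*) FROM Vehicles").fetchone()[0]
print(duplicate_rejected, vehicle_count)
```

Without the key, nothing would stop the same vehicle type from being listed twice.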

--
David Portas
SQL Server MVP
--

Jul 20 '05 #4

David Portas

I recommend you invest in a dictionary... or Google for one :-)

In SQL we have columns and rows NOT fields and records. Conceptually they
are quite different.

--
David Portas
SQL Server MVP
--

Jul 20 '05 #5

Hilarion

> In SQL we have columns and rows NOT fields and records. Conceptually they are quite different.

Could you point out the differences, please?
I could swear that in relational DB terminology fields and records are
common. SQL DBs are relational DBs.

Hilarion

Jul 20 '05 #6

David Portas

> Could you point the differences, please?

http://www.google.com/groups?selm=e%...phx.gbl&rnum=6

> SQL DBs are relational DBs.

I'll watch out for that one to be quoted on www.dbdebunk.com ;-)

--
David Portas
SQL Server MVP
--

Jul 20 '05 #7

Joe Celko

why do you kids know all the internet codes and emoticons, but not
Latin? "sic" means "error in the original"; look up etc., et al, ibid,
e.g., i.e. and so forth.

Better yet, look up the telegram codes.

--CELKO--
Please post DDL, so that people do not have to guess what the keys,
constraints, Declarative Referential Integrity, datatypes, etc. in your
schema are. Sample data is also a good idea, along with clear
specifications.
*** Sent via Developersdex http://www.developersdex.com ***
Don't just participate in USENET...get rewarded for it!

Jul 20 '05 #8

Erland Sommarskog

toedipper (se******************@hotmail.com) writes:

I am designing a table of vehicle types, nothing special, just a list of
unique vehicle types such as truck, lorry, bike, motor bike, plane,
tractor

For the table design I am proposing a single column table with a field
name called vehicle_type and this will contain the vehicle type.

Normally you would use this value somewhere. In that case it's usually
a bit awkward to have the string "tractor" all over the place. So rather
you would have things like:

CREATE TABLE vehicle_types (vtypid   smallint    NOT NULL,
                            vtypname varchar(20) NOT NULL,
                            CONSTRAINT pk_vtype PRIMARY KEY (vtypid))

CREATE TABLE vehicles (vehicleid    int      NOT NULL,
                       ...
                       vehicle_type smallint NOT NULL,
                       ...
                       CONSTRAINT fk_vehicle_type FOREIGN KEY (vehicle_type)
                          REFERENCES vehicle_types (vtypid))

If your code includes hard coded tests for various vehicle types, a short
char field for a mnemonic code may be better.
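
A rough sketch of how the lookup design behaves, again using Python's sqlite3 as a stand-in rather than SQL Server. The elided columns are simply left out, the PRIMARY KEY on vehicleid and the sample rows are my assumptions, and the PRAGMA line is SQLite-specific.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when asked
conn.executescript("""
CREATE TABLE vehicle_types (
    vtypid   SMALLINT    NOT NULL,
    vtypname VARCHAR(20) NOT NULL,
    CONSTRAINT pk_vtype PRIMARY KEY (vtypid));

CREATE TABLE vehicles (
    vehicleid    INT      NOT NULL PRIMARY KEY,  -- assumed key, not in the post
    vehicle_type SMALLINT NOT NULL,
    CONSTRAINT fk_vehicle_type FOREIGN KEY (vehicle_type)
        REFERENCES vehicle_types (vtypid));

INSERT INTO vehicle_types VALUES (1, 'tractor'), (2, 'van');
INSERT INTO vehicles VALUES (100, 1), (101, 1), (102, 2);
""")

# The name is stored once, so renaming it touches a single row.
conn.execute("UPDATE vehicle_types SET vtypname = 'farm tractor' WHERE vtypid = 1")

rows = conn.execute("""
    SELECT v.vehicleid, t.vtypname
    FROM vehicles AS v
    JOIN vehicle_types AS t ON t.vtypid = v.vehicle_type
    ORDER BY v.vehicleid
""").fetchall()

# A vehicle that points at a non-existent type is rejected by the foreign key.
try:
    conn.execute("INSERT INTO vehicles VALUES (103, 9)")
    unknown_type_rejected = False
except sqlite3.IntegrityError:
    unknown_type_rejected = True

print(rows, unknown_type_rejected)
```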
--
Erland Sommarskog, SQL Server MVP, es****@sommarskog.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techinf...2000/books.asp

Jul 20 '05 #9

Joe Celko

Like most new ideas, the hard part of understanding what the relational
model is comes in un-learning what you know about file systems.

As Artemus Ward (the pen name of Charles Farrar Browne, 1834-1867) is
often credited with saying, "It ain't so much the things we don't know
that get us into trouble. It's the things we know that just ain't so."

If you already have a background in data processing with traditional
file systems, the first things to un-learn are:

(0) Databases are not file sets.
(1) Tables are not files.
(2) Rows are not records.
(3) Columns are not fields.

Modern data processing began with punch cards, or Hollerith cards used
by the Bureau of the Census. Their original size was that of a United
States Dollar bill. This was set by their inventor, Herman Hollerith,
because he could get furniture to store the cards from the United States
Treasury Department, just across the street. Likewise, physical
constraints limited each card to 80 columns of holes in which to record
a symbol.

The influence of the punch card lingered on long after the invention of
magnetic tapes and disk for data storage. This is why early video
display terminals were 80 columns across. Even today, files which were
migrated from cards to magnetic tape files or disk storage still use 80
column records.

But the influence was not just on the physical side of data processing.
The methods for handling data from the prior media were imitated in the
new media.

Data processing first consisted of sorting and merging decks of punch
cards (later, sequential magnetic tape files) in a series of distinct
steps. The result of each step fed into the next step in the process.

Relational databases do not work that way. Each user connects to the
entire database all at once, not to one file at a time in a sequence of
steps. The users might not all have the same database access rights
once they are connected, however. Magnetic tapes could not be shared
among users at the same time, but shared data is the point of a
database.

Tables versus Files

A file is closely related to its physical storage media. A table may or
may not be a physical file. DB2 from IBM uses one file per table,
while Sybase puts several entire databases inside one file. A table is
a set of rows of the same kind of thing. A set has no ordering
and it makes no sense to ask for the first or last row.

A deck of punch cards is sequential, and so are magnetic tape files.
Therefore, a physical file of ordered sequential records also
became the mental model for data processing and it is still hard
to shake. Anytime you look at data, it is in some physical ordering.

The various access methods for disk storage systems came later, but even
these access methods could not shake the mental model.

Another conceptual difference is that a file is usually data that deals
with a whole business process. A file has to have enough data in
itself to support applications for that business process. Files tend to
be "mixed" data which can be described by the name of the business
process, such as "The Payroll file" or something like that.

Tables can be either entities or relationships within a business
process. This means that the data which was held in one file is often
put into several tables. Tables tend to be "pure" data which can be
described by single words. The payroll would now have separate tables
for timecards, employees, projects and so forth.

Tables as Entities

An entity is a physical or conceptual "thing" which has meaning by itself.
A person, a sale or a product would be an example. In a relational
database, an entity is defined by its attributes, which are shown as
values in columns in rows in a table.

To remind users that tables are sets of entities, I like to use plural
or collective nouns that describe the function of the entities within
the system for the names of tables. Thus "Employee" is a bad name
because it is singular; "Employees" is a better name because it is
plural; "Personnel" is best because it is collective and does not summon
up a mental picture of individual persons.

If you have tables with exactly the same structure, then they are sets
of the same kind of elements. But you should have only one set for each
kind of data element! Files, on the other hand, were PHYSICALLY
separate units of storage which could be alike -- each tape or disk file
represents a step in the PROCEDURE, such as moving from raw data, to
edited data, and finally to archived data.

Tables as Relationships

A relationship is shown in a table by columns which reference one or
more entity tables. Without the entities, the relationship has no
meaning, but the relationship can have attributes of its own. For
example, a show business contract might have an agent, an employer and a
talent. The method of payment is an attribute of the contract itself,
and not of any of the three parties.

Rows versus Records

Rows are not records. A record is defined in the application program
which reads it; a row is defined in the database schema and not by a
program at all. The name of a field comes from the READ or INPUT
statements of the application; the name of a column comes from the
database schema.

All empty files look alike; they are a directory entry in the operating
system with a name and a length of zero bytes of storage. Empty tables
still have columns, constraints, security privileges and other
structures, even though they have no rows.
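
To make that concrete, here is a small sketch (Python's sqlite3 as a stand-in; the Personnel columns are made up for illustration) in which a table with zero rows still reports its columns and still enforces its CHECK() constraint.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Personnel (
        emp_id   INTEGER     NOT NULL PRIMARY KEY,
        emp_name VARCHAR(30) NOT NULL,
        salary   INTEGER     NOT NULL CHECK (salary >= 0))
""")

# The table is empty ...
row_count = conn.execute("SELECT COUNT(*) FROM Personnel").fetchone()[0]

# ... but its structure is still there to be inspected ...
cursor = conn.execute("SELECT * FROM Personnel")
column_names = [col[0] for col in cursor.description]

# ... and its constraints are still enforced on the first row offered.
try:
    conn.execute("INSERT INTO Personnel VALUES (1, 'Ann', -5)")
    bad_salary_rejected = False
except sqlite3.IntegrityError:
    bad_salary_rejected = True

print(row_count, column_names, bad_salary_rejected)
```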

This is in keeping with the set theoretical model, in which the empty
set is a perfectly good set. The difference between SQL's set model and
standard mathematical set theory is that set theory has only one empty
set, but in SQL each table has a different structure, so they cannot be
used in places where non-empty versions of themselves could not be used.

Another characteristic of rows in a table is that they are all alike in
structure and they are all the "same kind of thing" in the model. In a
file system, records can vary in size, datatypes and structure by having
flags in the data stream that tell the program reading the data how to
interpret it. The most common examples are Pascal's variant record, C's
struct syntax and Cobol's OCCURS clause.

The OCCURS clause in Cobol has a number which tells the program how many
times a record structure is to be repeated in the current record. A
variant record in Pascal has a tag field which tells the program which
of several layouts the rest of the record uses.

Unions in 'C' are not variant records, but variant mappings for the same
physical memory. For example:

union x {int ival; char j[4];} myStuff;

defines myStuff to be either an integer (which is 4 bytes on most
modern C compilers, but this code is non-portable) or an array of 4
bytes, depending on whether you say myStuff.ival or myStuff.j[0].
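
The same overlay can be tried from Python with the standard ctypes module. This is only a sketch: I have swapped in a fixed-width c_int32 and unsigned bytes so the size is predictable, which is an assumption on my part and not what the C declaration above guarantees.

```python
import ctypes


class XUnion(ctypes.Union):
    """Two views of the same four bytes, like the C union above."""
    _fields_ = [
        ("ival", ctypes.c_int32),      # fixed-width stand-in for C's int
        ("j", ctypes.c_ubyte * 4),     # the same storage seen as 4 bytes
    ]


my_stuff = XUnion()
my_stuff.ival = 1

# Reading through the other member shows the integer's raw bytes.
# Their order depends on the machine's byte order, so it is not printed as a fact.
raw_bytes = list(my_stuff.j)
union_size = ctypes.sizeof(XUnion)
print(union_size, raw_bytes)
```

Nothing in the data says which view is the right one; the reading program decides, which is the point being made about records.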

But even more than that, files often contained records which were
summaries of subsets of the other records -- so called control break
reports. There is no requirement that the records in a file be related
in any way -- they are literally a stream of binary data whose meaning
is assigned by the program reading them.

Columns versus Fields

A field within a record is defined by the application program that reads
it. A column in a row in a table is defined by the database schema.
The datatypes in a column are always scalar.

The order of the application program variables in the READ or INPUT
statements is important because the values are read into the program
variables in that order. In SQL, columns are referenced only by their
names. Yes, there are shorthands like the SELECT * clause and INSERT
INTO <table name> statements which expand into a list of column names in
the physical order in which the column names appear within their table
declaration, but these are shorthands which resolve to named lists.
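
A quick sketch of that point, using Python's sqlite3 as a stand-in and a made-up Timecards table: the INSERT and the SELECT name their columns in whatever order they like, and only SELECT * falls back on declaration order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Timecards (
        emp_id    INTEGER  NOT NULL,
        work_date CHAR(10) NOT NULL,
        hours     INTEGER  NOT NULL)
""")

# Columns are matched by name, so the order of this list is up to the writer.
conn.execute(
    "INSERT INTO Timecards (hours, work_date, emp_id) VALUES (8, '2004-11-01', 42)"
)

# SELECT * expands to the column names in declaration order.
star_columns = [col[0] for col in conn.execute("SELECT * FROM Timecards").description]

# An explicit list returns the values in the order asked for.
named_row = conn.execute(
    "SELECT work_date, emp_id, hours FROM Timecards"
).fetchone()

print(star_columns, named_row)
```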

The use of NULLs in SQL is also unique to the language. Fields do not
support a missing data marker as part of the field, record or file
itself. Nor do fields have constraints which can be added to them in
the record, like the DEFAULT and CHECK() clauses in SQL.
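
A minimal sketch of NULL, DEFAULT and CHECK() living in the schema rather than in the program (Python's sqlite3 as a stand-in; the Products table and its columns are hypothetical).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Products (
        product_nbr       INTEGER     NOT NULL PRIMARY KEY,
        product_name      VARCHAR(30) NOT NULL,
        qty_on_hand       INTEGER     DEFAULT 0 NOT NULL CHECK (qty_on_hand >= 0),
        discontinued_date CHAR(10))
""")

# Only two columns are supplied; the schema fills in the rest.
conn.execute("INSERT INTO Products (product_nbr, product_name) VALUES (1, 'widget')")
defaults_row = conn.execute(
    "SELECT qty_on_hand, discontinued_date FROM Products WHERE product_nbr = 1"
).fetchone()

# NULL is a marker, not a value: it is found with IS NULL, not with '='.
null_count = conn.execute(
    "SELECT COUNT(*) FROM Products WHERE discontinued_date IS NULL"
).fetchone()[0]

# The CHECK() constraint rejects bad data no matter which program sends it.
try:
    conn.execute("INSERT INTO Products VALUES (2, 'gadget', -1, NULL)")
    negative_qty_rejected = False
except sqlite3.IntegrityError:
    negative_qty_rejected = True

print(defaults_row, null_count, negative_qty_rejected)
```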

Relationships among tables within a database

Files are pretty passive creatures and will take whatever an application
program throws at them without much objection. Files are also
independent of each other simply because they are connected to one
application program at a time and therefore have no idea what other
files look like.

A database actively seeks to maintain the correctness of all its data.
The methods used are triggers, constraints and declarative referential
integrity.

Declarative referential integrity (DRI) says, in effect, that data in
one table has a particular relationship with data in a second (possibly
the same) table. It is also possible to have the database change itself
via referential actions associated with the DRI.

For example, a business rule might be that we do not sell products which
are not in inventory. This rule would be enforced by a REFERENCES clause
on the Orders table which references the Inventory table and a
referential action of ON DELETE CASCADE.
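
The Orders/Inventory rule can be sketched like this, with Python's sqlite3 standing in for a full SQL product. The column names and rows are assumptions for illustration, and the PRAGMA is needed only because SQLite leaves foreign keys off by default.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific switch
conn.executescript("""
CREATE TABLE Inventory (
    product_nbr INTEGER NOT NULL PRIMARY KEY,
    qty_on_hand INTEGER NOT NULL);

CREATE TABLE Orders (
    order_nbr   INTEGER NOT NULL PRIMARY KEY,
    product_nbr INTEGER NOT NULL
        REFERENCES Inventory (product_nbr)
        ON DELETE CASCADE);

INSERT INTO Inventory VALUES (1, 10), (2, 5);
INSERT INTO Orders VALUES (100, 1), (101, 2);
""")

# An order for a product that is not in inventory is refused.
try:
    conn.execute("INSERT INTO Orders VALUES (102, 99)")
    unknown_product_rejected = False
except sqlite3.IntegrityError:
    unknown_product_rejected = True

# Dropping a product from inventory removes its orders via the referential action.
conn.execute("DELETE FROM Inventory WHERE product_nbr = 1")
remaining_orders = [
    row[0] for row in conn.execute("SELECT order_nbr FROM Orders ORDER BY order_nbr")
]

print(unknown_product_rejected, remaining_orders)
```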

Triggers are a more general way of doing much the same thing as DRI. A
trigger is a block of procedural code which is executed before, after or
instead of an INSERT INTO, UPDATE or DELETE statement. You can do
anything with a trigger that you can do with DRI and more.

However, there are problems with TRIGGERs. While there is a standard
syntax for them in the SQL-99 standard, most vendors have not
implemented it. What they have is very proprietary syntax instead.
Secondly, a trigger cannot pass information to the optimizer like DRI.
In the example in this section, I know that for every product number in
the Orders table, I have that same product number in the Inventory
table. The optimizer can use that information in setting up EXISTS()
predicates and JOINs in the queries. There is no reasonable way to
parse procedural trigger code to determine this relationship.

The CREATE ASSERTION statement in SQL-92 will allow the database to
enforce conditions on the entire database as a whole. An ASSERTION is
not like a CHECK() clause, but the difference is subtle. A CHECK()
clause is executed when there are rows in the table to which it is
attached. If the table is empty then all CHECK() clauses are
effectively TRUE. Thus, if we wanted to be sure that the Inventory
table is never empty, and we wrote:

CREATE TABLE Inventory
( ...
CONSTRAINT inventory_not_empty
CHECK ((SELECT COUNT(*) FROM Inventory) > 0), ... );

it would not work. However, we could write:

CREATE ASSERTION Inventory_not_empty
CHECK ((SELECT COUNT(*) FROM Inventory) > 0);

and we would get the desired results. The assertion is checked at the
schema level and not at the table level.
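
SQLite has no CREATE ASSERTION, so the following is only a workaround sketch, not the standard statement: a trigger (in SQLite's own trigger syntax) that refuses to delete the last Inventory row. It uses Python's sqlite3, a cut-down one-column Inventory table of my own, and it does nothing about a table that starts out empty.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Inventory (product_nbr INTEGER NOT NULL PRIMARY KEY)")

# Fires before each row is deleted; aborts when that row is the only one left.
conn.execute("""
    CREATE TRIGGER inventory_not_empty
    BEFORE DELETE ON Inventory
    WHEN (SELECT COUNT(*) FROM Inventory) = 1
    BEGIN
        SELECT RAISE(ABORT, 'Inventory must not be empty');
    END
""")

conn.execute("INSERT INTO Inventory VALUES (1), (2)")
conn.execute("DELETE FROM Inventory WHERE product_nbr = 1")  # one row remains

try:
    conn.execute("DELETE FROM Inventory WHERE product_nbr = 2")
    last_delete_rejected = False
except sqlite3.DatabaseError:
    last_delete_rejected = True

rows_left = conn.execute("SELECT COUNT(*) FROM Inventory").fetchone()[0]
print(last_delete_rejected, rows_left)
```

This also illustrates the earlier complaint about triggers: the rule is buried in procedural code rather than declared.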

--CELKO--

Jul 20 '05 #10

Daven Thrice

"toedipper" <se******************@hotmail.com> wrote in message
news:30*************@uni-berlin.de...

Thanks for your reply.

Totally of topic but what does 'sic' mean? I see this all over the place
both on the web and in print.

It means that what was written was written that way intentionally. For
example, one might say that they when they write, they mispel a lot of words
[sic].

There's a pretty good dictionary at http://m-w.com

Also, if you go to google and search for any word, you'll see that word in
the upper-right quadrant of the screen, next to a link that says
[definition]. Click that link and you'll get the definition.

Jul 20 '05 #11

Lyle Fairfield

Joe Celko <jc*******@earthlink.net> wrote in news:41a65261$0$14485
$c******@news.newsgroups.ws:

why do you kids know all the internet codes and emoticons, but not
Latin? "sic" means "error in the original" [Sic].

Jul 20 '05 #12

Hilarion

>> Could you point the differences, please?

http://www.google.com/groups?selm=e%...phx.gbl&rnum=6

This post uses the terms "records" and "fields" as filesystem terms
(runs of bytes stored in files) and as programming language
structures (e.g. unions and structs in C/C++).
I use those terms to describe logical structures, not only
physical representation. For me a struct in C/C++ is a way to
represent/implement a logical record, just as it can implement
a table row (though that may not be a very good idea).

I agree with the reply to that post:
http://www.google.com/groups?hl=pl&l...TNGP11.phx.gbl

It's only a matter of treating the terms in a logical or a physical
manner. I can think of the terms "column" and "row" in a physical
manner too (e.g. for C/C++/Pascal tables or other more or less
complex programming language structures).

The main idea of relational databases is to have relations
between pieces of information. Storing the information in one
record of one table is one way to represent the relation;
the other is to use FOREIGN KEYs. It's not that important
whether we call the smallest part of information a "field" or "column"
or "cell" (the last one is for me the best one in terms of
"columns" and "rows" and very close to the "field" term) as long
as we know what we are describing and are able to understand
what another person is trying to describe.
For me a much better term than "table" is "set", because table
suggests a sequence of information.

Hilarion

Jul 20 '05 #13

-P-

"toedipper" <se******************@hotmail.com> wrote in message news:30*************@uni-berlin.de...

Hello,

I am designing a table of vehicle types, nothing special, just a list of
unique vehicle types such as truck, lorry, bike, motor bike, plane, tractor
etc etc

For the table design I am proposing a single column table with a field name
called vehicle_type and this will contain the vehicle type.

So it will be

vehicle_type
car
bike
tractor
plane
truck
van
blah
blah
blah

Is this ok? Or is there a better way to do it?

Thanks,

td.

I've never liked using the "descriptive name" of the entity as its primary key. What happens if that descriptive name
changes, and the table is referenced as a foreign key by other tables? You've got a referential integrity problem
(unless you endorse the use of ON UPDATE CASCADE, which I personally abhor.)

This is where I would use a separate identifier as the primary key. Something that will never change (an
autoincremented integer, for example), so that the "description" column can contain the more volatile descriptive text.

--
Paul Horan
Sr. Architect
VCI Springfield, Mass
www.vcisolutions.com

Jul 20 '05 #14

Greg D. Moore (Strider)

"-P-" <en**********@hotmail.DOTcom> wrote in message
news:fq********************@adelphia.com...

> This is where I would use a separate identifier as the primary key.
> Something that will never change (an autoincremented integer, for
> example), so that the "description" column can contain the more
> volatile descriptive text.

You realize an autoincremented integer, at least in MS SQL Server is
terrible for this. You can't guarantee that you won't have gaps and you
can't even guarantee that the numbers will remain the same. DBCC CHECKIDENT
can reset things on you, copying them to another DB may completely change
the numbers, etc.

You are right that the descriptions may change. I'd recommend something
like a partial VIN since that's described by an outside authority and pretty
much won't change.

Jul 20 '05 #15

Hilarion

> You realize an autoincremented integer, at least in MS SQL Server is
> terrible for this. You can't guarantee that you won't have gaps

What's with those gaps? I always wonder. Who cares about them? It's not
a row number, but a key, so gaps are OK.

Hilarion

Jul 20 '05 #16

David Portas

You've failed to make the vital distinction between a *surrogate* key and a
*natural* key. You have no data integrity if your data looks like this:

id          vehicle
----------- -----------
1           Car
2           Car
3           Truck

An autoincrementing surrogate key should never be the only key of a table.

--
David Portas
SQL Server MVP
--

Jul 20 '05 #17

-P-

"Greg D. Moore (Strider)" <mo****************@greenms.com> wrote in message
news:9E*****************@twister.nyroc.rr.com...

"-P-" <en**********@hotmail.DOTcom> wrote in message
news:fq********************@adelphia.com...
>> This is where I would use a separate identifier as the primary key.
>> Something that will never change (an autoincremented integer, for
>> example), so that the "description" column can contain the more
>> volatile descriptive text.

> You realize an autoincremented integer, at least in MS SQL Server is
> terrible for this. You can't guarantee that you won't have gaps and you
> can't even guarantee that the numbers will remain the same. DBCC CHECKIDENT
> can reset things on you, copying them to another DB may completely change
> the numbers, etc.
>
> You are right that the descriptions may change. I'd recommend something
> like a partial VIN since that's described by an outside authority and pretty
> much won't change.

"Not having gaps" wasn't specified as a requirement... <G> I think gaps in identity columns are perfectly fine -
unless you're exposing the number to the user, as with an invoice number or some other "auditable" entity. In that
case, I wouldn't use IDENTITY.

-P-

Jul 20 '05 #18

-P-

"David Portas" <RE****************************@acm.org> wrote in message news:1P********************@giganews.com...

You've failed to make the vital distinction between a *surrogate* key and a *natural* key. You have no data integrity
if your data looks like this:

id          vehicle
----------- -----------
1           Car
2           Car
3           Truck

An autoincrementing surrogate key should never be the only key of a table.

Baloney... A simple unique index or constraint on the description would prevent the scenario you just described.

"Never"? I also disagree with that statement. There are perfectly acceptable uses for that design.

-Paul-

Jul 20 '05 #19

Hilarion

> I agree with the reply to that post:

http://www.google.com/groups?hl=pl&l...TNGP11.phx.gbl

I mean I agree with the post pointed to by the URL, which is a reply to the
original post.

Hilarion

Jul 20 '05 #20

David Portas

> A simple unique index or constraint on the description would prevent the
> scenario you just described.

That's what I was suggesting. You didn't mention the importance of declaring
the natural key so I was just making the point for the benefit of the OP.

> "Never"? I also disagree with that statement. There are perfectly
> acceptable uses for that design.

As part of an ETL process it may be acceptable in a staging table to use
only an artificial key. Not in a relational schema. Without a natural key by
implication you have redundant data and by definition no meaningful way to
define the entity you are modelling.

The cases I've come across that perhaps justify an exception to this are
when you have an automatic, event-driven process which logs to a table
without human intervention. It may not be feasible as part of a real-time
logging process to ensure that a natural key is enforced. The sequence of
events is the data you are attempting to capture but the database system may
not support a date/timestamp of sufficient precision to guarantee that each
event has a unique time. This is really a problem of application design
rather than relational database design. You still generate redundant data
but in a log that isn't always a big problem.

--
David Portas
SQL Server MVP
--

Jul 20 '05 #21

Erland Sommarskog

Greg D. Moore (Strider) (mo****************@greenms.com) writes:

You realize an autoincremented integer, at least in MS SQL Server is
terrible for this. You can't guarantee that you won't have gaps and you
can't even guarantee that the numbers will remain the same. DBCC
CHECKIDENT can reset things on you, copying them to another DB may
completely change the numbers, etc.

Gaps are a non-issue in this case.

That said, you are terribly lazy if you need to have auto-incremented
ids for a simple lookup table. Better to assign the ids manually.

--
Erland Sommarskog, SQL Server MVP, es****@sommarskog.se


Jul 20 '05 #22

-P-

"David Portas" <RE****************************@acm.org> wrote in message news:GL********************@giganews.com...

A simple unique index or constraint on the description would prevent the scenario you just described.

That's what I was suggesting. You didn't mention the importance of declaring the natural key so I was just making the
point for the benefit of the OP.
"Never"? I also disagree with that statement. There are perfectly acceptable uses for that design.

As part of an ETL process it may be acceptable in a staging table to use only an artificial key. Not in a relational
schema. Without a natural key by implication you have redundant data and by definition no meaningful way to define the
entity you are modelling.

The cases I've come across that perhaps justify an exception to this are when you have an automatic, event-driven
process which logs to a table without human intervention. It may not be feasible as part of a real-time logging
process to ensure that a natural key is enforced. The sequence of events is the data you are attempting to capture but
the database system may not support a date/timestamp of sufficient precision to guarantee that each event has a unique
time. This is really a problem of application design rather than relational database design. You still generate
redundant data but in a log that isn't always a big problem.

We still disagree. I would never use autoincrement in situations where a natural key was evident and available, but I
still contend that there is a place for autoincrement in the relational model. What about entities that have no
identifiable "natural" key? We have several in our model, and had to invent an identification scheme for them.

--
Paul Horan

Jul 20 '05 #23

David Portas

> What about entities that have no identifiable "natural" key? We have
> several in our model, and had to invent an identification scheme for them.

There are two separate issues and maybe neither of us have spelt them out
well enough.

1) When you model an entity in a relational database it must always have a
natural key by definition (= "a subset of the attributes that uniquely
identifies a row"), otherwise you have redundancy and no integrity. For
example that key could be the vehicle description in Toedipper's case. If
you have a table without a natural key then someone has failed to identify
the entity properly in the logical model. However the natural key may not
necessarily be convenient for use as a foreign key (because of storage or
performance considerations for example).

2) In cases where the data changes infrequently it may additionally be
desirable to create your own user-assigned surrogate key, such as "C" for
"Car" for example. In other cases, a system-assigned surrogate (such as
IDENTITY in SQL Server) is often used. Surrogate keys are not a substitute
for a natural key, which should still be declared as a key of the table.
Unfortunately system-assigned "row-identifier" surrogates are too often used
carelessly by those who fail to design a proper logical model and think they
don't need real keys. System-assigned surrogates are not part of the logical
data model at all - they are part of the physical implementation.
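
The two points above can be sketched together: a system-assigned surrogate used as the primary key, with the natural key still declared UNIQUE. This uses Python's sqlite3 as a stand-in, where INTEGER PRIMARY KEY plays the role that IDENTITY plays in SQL Server; the table and column names are borrowed from earlier in the thread.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE vehicle_types (
        vtypid   INTEGER PRIMARY KEY,          -- system-assigned surrogate
        vtypname VARCHAR(20) NOT NULL UNIQUE)  -- natural key, still declared
""")

conn.execute("INSERT INTO vehicle_types (vtypname) VALUES ('Car')")
conn.execute("INSERT INTO vehicle_types (vtypname) VALUES ('Truck')")

# With only the surrogate as a key, a second 'Car' would slip in under a new id.
# The UNIQUE constraint on the natural key is what stops it.
try:
    conn.execute("INSERT INTO vehicle_types (vtypname) VALUES ('Car')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

type_rows = conn.execute(
    "SELECT vtypid, vtypname FROM vehicle_types ORDER BY vtypid"
).fetchall()
print(duplicate_rejected, type_rows)
```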

There is of course room for a great deal of debate about the wisdom or
otherwise of using system-assigned surrogate keys at all. There are
reasonable arguments on both sides and I don't want to go over that debate
again here. What isn't usually disputed is that a surrogate key should never
be the ONLY key of a table. As soon as you compromise that principle your
data model is lost and you have big logical problems with data integrity and
often insurmountable practical problems when it comes to getting meaningful
information from the data.

--
David Portas
SQL Server MVP
--

Jul 20 '05 #24

Erland Sommarskog

David Portas (RE****************************@acm.org) writes:

There is of course room for a great deal of debate about the wisdom or
otherwise of using system-assigned surrogate keys at all. There are
reasonable arguments on both sides and I don't want to go over that
debate again here. What isn't usually disputed is that a surrogate key
should never be the ONLY key of a table. As soon as you compromise that
principle your data model is lost and you have big logical problems with
data integrity and often insurmountable practical problems when it comes
to getting meaningful information from the data.

I would say that there certainly are cases where some sort of system-
generated key is the only possible key. Take for instance a table with
account transactions. You can fairly well describe a transaction by using
account number, date and time of day. But there may be two transactions
for the same account in the same millisecond, so those three alone cannot
make a key. You can then try to find some constraint that distinguishes
two transactions that happen at the same time. (Typically they would be
generated by some batch process.) But you would then only be trying to
introduce a constraint that has no relation to business rules, and one day
you will get a failure for a perfectly valid transaction, because it
did not fit into the squared model. That's when a surrogate key is the
way to go.
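
A small sketch of the situation described above, using Python's sqlite3 as a stand-in. The table names, columns and the two sample transactions are made up for illustration, and amounts are kept as integer cents to avoid rounding.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE TransNatural (
    account      INTEGER  NOT NULL,
    trans_time   CHAR(23) NOT NULL,
    amount_cents INTEGER  NOT NULL,
    PRIMARY KEY (account, trans_time));

CREATE TABLE TransSurrogate (
    trans_id     INTEGER PRIMARY KEY,   -- system-generated key
    account      INTEGER  NOT NULL,
    trans_time   CHAR(23) NOT NULL,
    amount_cents INTEGER  NOT NULL);
""")

# Two valid transactions on the same account in the same millisecond.
batch = [
    (1234, "2004-11-01 09:10:01.104", 51299),
    (1234, "2004-11-01 09:10:01.104", 51200),
]

# With (account, time) as the key, the second valid transaction is refused.
natural_rejections = 0
for row in batch:
    try:
        conn.execute("INSERT INTO TransNatural VALUES (?, ?, ?)", row)
    except sqlite3.IntegrityError:
        natural_rejections += 1

# With a system-generated key, both are accepted.
conn.executemany(
    "INSERT INTO TransSurrogate (account, trans_time, amount_cents) VALUES (?, ?, ?)",
    batch,
)
surrogate_count = conn.execute("SELECT COUNT(*) FROM TransSurrogate").fetchone()[0]
print(natural_rejections, surrogate_count)
```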
--
Erland Sommarskog, SQL Server MVP, es****@sommarskog.se


Jul 20 '05 #25

David Portas

That depends on how you define the transaction entity. For example:

Account Timestamp               Amount
------- ----------------------- -------
1234    2004-11-01 09:10:01.104 512.99
1234    2004-11-01 09:10:01.104 512.00

might be adequately represented as:

Account Timestamp               Amount
------- ----------------------- -------
1234    2004-11-01 09:10:01.104 1024.99

Or even:

Account Timestamp               Amount  Trancount
------- ----------------------- ------- ---------
1234    2004-11-01 09:10:01.104 1024.99 2

In reality, at least in the financial systems I have worked with,
transactions contain more information than this. They are identified as part
of a batch by a unique batch number or journal number which is assigned at
the time the batch is generated. The batch number is itself a surrogate for
an entity composed of something like (originating_entity, location,
datetime).
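
The collapsed representation in the last table above is what a GROUP BY over the raw rows would produce. A sketch with Python's sqlite3 as a stand-in; the table name is made up, and the amounts are stored as integer cents so the sum is exact.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE RawTransactions (
        account      INTEGER  NOT NULL,
        trans_time   CHAR(23) NOT NULL,
        amount_cents INTEGER  NOT NULL)
""")
conn.executemany(
    "INSERT INTO RawTransactions VALUES (?, ?, ?)",
    [
        (1234, "2004-11-01 09:10:01.104", 51299),   # 512.99
        (1234, "2004-11-01 09:10:01.104", 51200),   # 512.00
    ],
)

# One row per (account, time), with the total and the number of rows merged.
collapsed = conn.execute("""
    SELECT account, trans_time, SUM(amount_cents), COUNT(*) AS trancount
    FROM RawTransactions
    GROUP BY account, trans_time
""").fetchall()
print(collapsed)
```

Whether losing the individual rows is acceptable is a business question, not a SQL one.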

--
David Portas
SQL Server MVP
--

Jul 20 '05 #26

Erland Sommarskog

David Portas (RE****************************@acm.org) writes:

> That depends on how you define the transaction entity. For example:
>
> Account Timestamp               Amount
> ------- ----------------------- -------
> 1234    2004-11-01 09:10:01.104 512.99
> 1234    2004-11-01 09:10:01.104 512.00
>
> might be adequately represented as:
>
> Account Timestamp               Amount
> ------- ----------------------- -------
> 1234    2004-11-01 09:10:01.104 1024.99

That is very likely to be completely unacceptable.

> In reality, at least in the financial systems I have worked with,
> transactions contain more information than this.

Yes, there is very likely to be more information: transaction type,
transaction text etc. And some of these may be different, which is
why you cannot collapse two transactions into one. But the problem is
that you would have to include about every column in the table, so as
not to put up a roadblock for a pair of transactions that are valid for
the real-world business.

> They are identified as part of a batch by a unique batch number or
> journal number which is assigned at the time the batch is generated. The
> batch number is itself a surrogate for an entity composed of something
> like (originating_entity, location, datetime).

Yes, there may be such a thing. But not all transactions may be booked
in this way.
--
Erland Sommarskog, SQL Server MVP, es****@sommarskog.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techinf...2000/books.asp

Jul 20 '05 #27

David Portas

> you would have to include about every column in the table, to not
> put up a roadblock for a pair of transactions that are valid for the
> real-world business.

Q.E.D.

--
David Portas
SQL Server MVP
--

Jul 20 '05 #28

-P-

I've seen models that take the use of autoincrement to the extreme, where EVERY table has a sequence column as the PK,
and that's also wrong. I'm certainly not advocating that - but I do take issue with your assertion that there is NO
PLACE for autoincrement in a relational database.

For example, our Order table for storing commercial airtime orders from Ad agencies to TV stations... There are about
30 columns that help to describe the parameters of the order, including the Advertiser_ID, an Agency_ID (both fk
references to the Name_Address table), some dates, various and sundry codes, revision counters, and some descriptive
text... You're suggesting that we find the 7 or 8 columns that combine to uniquely identify an Order (including the
aforementioned descriptive text column) and call those the primary key? And then, we get to replicate all that data on
every row of every table that references Order (roughly 20 additional tables, up to 5 levels deep in places).

This poor design is solved by creating the Order_ID column as an autoincrementing number, and using that as the primary
key. The references to Order then use Order_ID as the foreign key. No muss, no fuss. Significantly less duplication
of data, greatly increased performance for JOIN processing, and MUCH easier to work with from a development standpoint.

-Paul-
"David Portas" <RE****************************@acm.org> wrote in message news:m5********************@giganews.com...

> you would have to include about every column in the table, to not
> put up a roadblock for a pair of transactions that are valid for the
> real-world business.
>
> Q.E.D.
>
> --
> David Portas
> SQL Server MVP
> --

Jul 20 '05 #29

David Portas

> I do take issue with your assertion that there is NO PLACE for
> autoincrement in a relational database.

I didn't say that. Some people take the view that you should never use
"autoincrementing" keys but I'm not one of them.

> You're suggesting that we find the 7 or 8 columns that
> combine to uniquely identify an Order

Identifying the keys is an essential part of the process of designing the
logical model anyway.

> (including the aforementioned descriptive text column) and call
> those the primary key? And then, we get to replicate all that data
> on every row of every table that references Order (roughly 20
> additional tables, up to 5 levels deep in places).

No. You missed my point. Yes, create a compact surrogate key if you need to
and use that in other tables as the foreign key. But it is still essential
ALSO to declare the *natural* key columns. You don't have to duplicate the
natural key in any referencing tables but you do have to *declare* the key
in the parent table. From before: "An autoincrementing surrogate key should
never be the only key of a table." The key word is "only". That was the
assertion that you originally disagreed with but none of what you have said
contradicts that statement so maybe we agree after all. :-)

> This poor design is solved by creating the Order_ID column as an
> autoincrementing number, and using that as the primary key. The
> references to Order then use Order_ID as the foreign key. No muss, no
> fuss. Significantly less duplication of data, greatly increased
> performance for JOIN processing, and MUCH easier to work with from a
> development standpoint.

Agreed. System-generated "autoincrementing" keys aren't the only option for
a surrogate key but they do have their uses. I don't have anything against
surrogate keys - only against poorly designed tables without natural keys,
mainly because in my career I've spent a lot of time identifying and fixing
problems caused by other people's weak schema designs.
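
The arrangement described in this post, a compact surrogate used for
references plus a declared natural key in the parent table, might look
roughly like the following. This is only a sketch in SQL Server syntax: the
table is called Orders here because ORDER is a reserved word, and apart from
Order_ID, Advertiser_ID and Agency_ID every column is a hypothetical
stand-in, since the thread says the real natural key spans 7 or 8 columns.

```sql
-- Hypothetical, heavily simplified parent table.
CREATE TABLE Orders (
    order_id      INTEGER IDENTITY(1,1) NOT NULL,  -- autoincrementing surrogate
    advertiser_id INTEGER  NOT NULL,
    agency_id     INTEGER  NOT NULL,
    order_date    DATETIME NOT NULL,               -- assumed column
    revision      INTEGER  NOT NULL,               -- assumed column
    CONSTRAINT pk_orders PRIMARY KEY (order_id),
    -- The natural key is declared once, here in the parent table.
    CONSTRAINT uq_orders_natural
        UNIQUE (advertiser_id, agency_id, order_date, revision)
);

-- A hypothetical referencing table carries only the compact surrogate.
CREATE TABLE OrderLines (
    order_id INTEGER NOT NULL,
    line_no  INTEGER NOT NULL,
    CONSTRAINT pk_orderlines PRIMARY KEY (order_id, line_no),
    CONSTRAINT fk_orderlines_orders
        FOREIGN KEY (order_id) REFERENCES Orders (order_id)
);
```

The UNIQUE constraint is what stops two rows describing the same real-world
order from being inserted under different order_id values, while the child
table never has to repeat the natural key columns.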
--
David Portas
SQL Server MVP
--

Jul 20 '05 #30

Erland Sommarskog

David Portas (RE****************************@acm.org) writes:

> Identifying the keys is an essential part of the process of designing the
> logical model anyway.

And I say it again: that identification process could lead you to
realize that there is no real-world key which meets the requirements of the
relational model.

--
Erland Sommarskog, SQL Server MVP, es****@sommarskog.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techinf...2000/books.asp

Jul 20 '05 #31

David Portas

Logically impossible. Any relation {A,B,C} represents a fact. If the fact is
representative of some quantity that you need to represent N times then you
can always add another attribute to represent the count of N: {A,B,C,n}. You
can therefore always record any set of facts without violating any key. In a
table that doesn't represent quantitative facts there is obviously no
problem at all - each fact need only be recorded exactly once.
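
A minimal sketch of the {A,B,C,n} idea, using a hypothetical table named
Facts with integer columns (the names and types are assumptions, not
something given in the thread):

```sql
-- Hypothetical table: (a, b, c) identifies the fact, n counts occurrences.
CREATE TABLE Facts (
    a INTEGER NOT NULL,
    b INTEGER NOT NULL,
    c INTEGER NOT NULL,
    n INTEGER NOT NULL DEFAULT 1,
    CONSTRAINT pk_facts PRIMARY KEY (a, b, c),
    CONSTRAINT ck_facts_n CHECK (n >= 1)
);

-- First occurrence of the fact: n takes its default of 1.
INSERT INTO Facts (a, b, c) VALUES (1, 2, 3);

-- A second occurrence raises the count instead of adding a duplicate row,
-- so the key on (a, b, c) is never violated.
UPDATE Facts
SET n = n + 1
WHERE a = 1 AND b = 2 AND c = 3;
```

After these statements the table holds one row for (1, 2, 3) with n = 2.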

The idea that a particular schema design is a consequence of Business
Process is a common fallacy, I believe. A relational schema *models* a
business process; it is not *determined* by it. There are always choices to
be made in the design and business constraints are not an excuse for poor
design.

--
David Portas
SQL Server MVP
--

Jul 20 '05 #32

Erland Sommarskog

David Portas (RE****************************@acm.org) writes:

> Logically impossible. Any relation {A,B,C} represents a fact. If the
> fact is representative of some quantity that you need to represent N
> times then you can always add another attribute to represent the count
> of N: {A,B,C,n}. You can therefore always record any set of facts
> without violating any key. In a table that doesn't represent
> quantitative facts there is obviously no problem at all - each fact need
> only be recorded exactly once.

Yes, you might be able to add another attribute, but in the end you have an
attribute list which is as long as the universe, which has no practical
use, and which only serves to make the system more difficult to use and less
efficient.

> The idea that a particular schema design is a consequence of Business
> Process is a common fallacy, I believe. A relational schema *models* a
> business process; it is not *determined* by it. There are always choices to
> be made in the design and business constraints are not an excuse for poor
> design.

Real-world systems are built to solve business problems, not to appease
the ideas of relational theory. Just as always using a surrogate key is
poor design, it is also poor design to always define a natural key.

--
Erland Sommarskog, SQL Server MVP, es****@sommarskog.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techinf...2000/books.asp

Jul 20 '05 #33

cwepema

toedipper wrote:

> Thanks for your reply.
>
> Totally off topic but what does 'sic' mean? I see this all over the place
> both on the web and in print.
>
> Thanks,
>
> TD.
> "--CELKO--" <jc*******@earthlink.net> wrote in message
> news:18**************************@posting.google.c om...
>>> For the table design I am proposing a single column table with a
>>> field [sic] name called vehicle_type and this will contain the vehicle
>>> type... Is this ok? <<
>>
>> Sure, if you remember to make the one column the primary key. It is
>> weird just hanging out there in space without anything in the schema,
>> but it is legal. And fields are not anything like column.

According to the dictionary it (sic) means:
1. To set upon; attack.
2. To urge or incite to hostile action; set: sicced the dogs on the
intruders.

In sms language it means "As I See". I think.
Regards Kees

Jul 23 '05 #34

James Goodwin

sic
adv.
Thus; so. Used to indicate that a quoted passage, especially one
containing an error or unconventional spelling, has been retained in its
original form or written intentionally.

Regards,
Jim

"cwepema" <cw*****@presys.nl> wrote in message
news:Zu******************@newsfe01.lga...

> toedipper wrote:
>> Thanks for your reply.
>>
>> Totally off topic but what does 'sic' mean? I see this all over the place
>> both on the web and in print.
>>
>> Thanks,
>>
>> TD.
>> "--CELKO--" <jc*******@earthlink.net> wrote in message
>> news:18**************************@posting.google.c om...
>>>> For the table design I am proposing a single column table with a
>>>> field [sic] name called vehicle_type and this will contain the vehicle
>>>> type... Is this ok? <<
>>>
>>> Sure, if you remember to make the one column the primary key. It is
>>> weird just hanging out there in space without anything in the schema,
>>> but it is legal. And fields are not anything like column.
>
> According to the dictionary it (sic) means:
> 1. To set upon; attack.
> 2. To urge or incite to hostile action; set: sicced the dogs on the
> intruders.
>
> In sms language it means "As I See". I think.
> Regards Kees

Jul 23 '05 #35

--CELKO--

How do you think that Fed Wire and the other banking networks handle their
transactions? It ain't GUIDs and IDENTITY columns; it is very
intelligent keys that can be verified.

Jul 23 '05 #36

Pero Perić

tzutrfi8t797to0zh9ð'0uzjhð+u'0
"toedipper" <se******************@hotmail.com> wrote in message
news:30*************@uni-berlin.de...

> Hello,
>
> I am designing a table of vehicle types, nothing special, just a list of
> unique vehicle types such as truck, lorry, bike, motor bike, plane,
> tractor etc etc
>
> For the table design I am proposing a single column table with a field
> name called vehicle_type and this will contain the vehicle type.
>
> So it will be
>
> vehicle_type
> car
> bike
> tractor
> plane
> truck
> van
> blah
> blah
> blah
>
> Is this ok? Or is there a better way to do it?
>
> Thanks,
>
> td.

Jul 23 '05 #37
