Database abstraction layers

Colin McKinnon

Hi all,

There's lots of DB abstraction layers out there, but a quick look around
them hasn't turned up anything which seems to met my requirements. Before I
go off and write one I thought I'd ask here if anyone knows of such a
beast...

I want some code where I present an array of data, and the corresponding
primary key and let the code work out whether to INSERT or UPDATE it, I
also want to be able to present the data from a QBF or QBE then be able to
step through the result set. However I don't want to have to configure the
DBMS structure - after all most of it is already in the DBMS (OK so not the
relationships in a MySQL db). It'd be really cool if I could throw SQL
directly at it *too*.

Anybody any suggestions?

TIA,

C.

May 11 '06 #1

Subscribe Post Reply

2529

Iván Sánchez Ortega

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Colin McKinnon wrote:

I want some code where I present an array of data, and the corresponding
primary key and let the code work out whether to INSERT or UPDATE it,[...] [...] Anybody any suggestions?

Use MySQL and REPLACE statements.

- --
- ----------------------------------
Iván Sánchez Ortega -i-punto-sanchez--arroba-mirame-punto-net

Un ordenador no es un televisor ni un microondas, es una herramienta
compleja.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.3 (GNU/Linux)

iD8DBQFEY8CT3jcQ2mg3Pc8RAl57AJ901ARs69nLk3mptIDYKl fLDCeBogCfft10
HOi1URDxXXfp8rNg1RoHIGI=
=93vZ
-----END PGP SIGNATURE-----

May 11 '06 #2

Toby Inkster

IvÃ¡n SÃ¡nchez Ortega wrote:

Colin McKinnon wrote:
I want some code where I present an array of data, and the corresponding
primary key and let the code work out whether to INSERT or UPDATE it,[...]

[...]
Anybody any suggestions?

Use MySQL and REPLACE statements.

Very abstract.

One technique for the OP might be to prepare both an INSERT and an UPDATE
statement. Run the INSERT statement; if it fails then there's probably
already a row with that primary key, so run the UPDATE statement. Or you
could run UPDATE first, check the number of rows updated; if 0, run the
INSERT. Not particularly clean, but should be quite portable.

--
Toby A Inkster BSc (Hons) ARCS
Contact Me ~ http://tobyinkster.co.uk/contact

May 12 '06 #3

Colin McKinnon wrote:

There's lots of DB abstraction layers out there, but a quick look around
them hasn't turned up anything which seems to met my requirements.
Which probably means that you shouldn't even attempt to solve your
problem with an abstraction layer; easier solutions might be
available...
I want some code where I present an array of data, and the corresponding
primary key and let the code work out whether to INSERT or UPDATE it

You don't need PHP code for that; this can be handled on the database
level:

INSERT INTO the_table
SET field1=value1, field2=value2, field3=value3, ...
ON DUPLICATE KEY UPDATE field1=value1, field2=value2, field3=value3,
....

This syntax is available since MySQL 4.1.0.

You could also use a REPLACE query, but keep in mind that it works by
deleting the old record and inserting a new one, so if there are fields
that exist in the table, but not in the query, they will be lost.

Cheers,
NC

May 12 '06 #4

Kenneth Downs

Colin McKinnon wrote:

Are you married to mySQL?

I've created a dictionary-based comprehensive tool that uses Postgres as the
back end and runs on linux, and is GPL.

http://docs.secdat.com

I never code inserts or updates ever. My app code is full of stuff like:

$row=array("colname"=>value,"col2"=>value)
SQLX_Insert($table_dd,$row);

You can also do:

SQLX_InsertOrUpdate($table,$row)

which looks for the existing row and does an appropriate insert or update as
the case may be.

You can make a big batch of rows like:

$rows=array(
"table1"=>array(
array( .... row .....)
,array( .... row .....)
,array( .... row .....)
)
,"table2"=>array(
array( .... row .....)
,array( .... row .....)
,array( .... row .....)
)
);
SQLX_TransactionBegin();
SQLX_InsertsOrUpdates($rows);
SQLX_TransactionEnd("NAME");

If there is an error in the transaction, it gets logged and tagged with
"NAME" so that it can be examined later.

You can also grab anything you need to know from the online dictionary. The
complete description of every column in every table is available in the
"flat" array for each table:

$dd=dd_tableref("orders");
echo "The description of order # is: ".$dd['flat']['order']['description']

This includes detailed information about foreign keys.

There is a huge amount of database-side automation as well.

Our first system went live a few weeks ago and two more are coming soon. I
consider it late beta for my own use because i know all of the foibles it
has, but it is probably early alpha for somebody picking it up right now.

Hi all,

There's lots of DB abstraction layers out there, but a quick look around
them hasn't turned up anything which seems to met my requirements. Before
I go off and write one I thought I'd ask here if anyone knows of such a
beast...

I want some code where I present an array of data, and the corresponding
primary key and let the code work out whether to INSERT or UPDATE it, I
also want to be able to present the data from a QBF or QBE then be able to
step through the result set. However I don't want to have to configure the
DBMS structure - after all most of it is already in the DBMS (OK so not
the relationships in a MySQL db). It'd be really cool if I could throw SQL
directly at it *too*.

Anybody any suggestions?

TIA,

C.

--
Kenneth Downs
Secure Data Software, Inc.
(Ken)nneth@(Sec)ure(Dat)a(.com)

May 12 '06 #5

Thanks Kenneth - sounds interesting.

I had a quick look....

How do you deal with the problem of updating the primary key of a
record? Cascade update/deletes?

TIA,

C.

May 12 '06 #6

Dikkie Dik

> I want some code where I present an array of data, and the corresponding

primary key and let the code work out whether to INSERT or UPDATE it,
Quite simple. If the primary key in your application is numeric, you got
it from the database. Otherwise, how would you know the key value?
If the primary key value is NULL, it is a new record. even if it
contains the same non-key data as an existing record, it is different
anyhow. If you have more than one instance representing one record, you
may look at http://www.w-p.dds.nl/article/wrtabrec.htm for how to fix
that (shamelessly plugging my own article).
I also want to be able to present the data from a QBF or QBE then be able to
step through the result set. However I don't want to have to configure the
DBMS structure - after all most of it is already in the DBMS (OK so not the
relationships in a MySQL db). It'd be really cool if I could throw SQL
directly at it *too*.

I think having to configure a mapper is a bad code smell. My biggest
problem today is the dependency across systems. You probably have
encountered situations where file names were in the database, or where
constants in an application referred to enumerations in a database
field. These dependencies cannot be enforced (no one will stop you from
deleting a file if its name is in a database, for example). If you have
a configuration file for a mapper, you introduce *yet another* non
enforceable dependency, along with an untestable system.

But then, I always write my mappings myself. Common code gets into
superclasses and the strict typing (in languages that support it) and
the communication stategies into the wrappers. So definition/lookup
tables become read-only collections, for instance. For each table, I
decide whether I want it to be lazy, greedy, preloadable (you can
"schedule" a record without loading it yet, but it will be loaded at the
first necessary database action), or whatever other combination of lazy
and greedy.
Always remember that a class should encapsulate and hide its internal
structure. If a table strategy would change the interface of its
wrapping collection, it is not a useful wrapper.

Best regards

May 12 '06 #7

Colin McKinnon

Dikkie Dik wrote:

I want some code where I present an array of data, and the corresponding
primary key and let the code work out whether to INSERT or UPDATE it,
Quite simple. If the primary key in your application is numeric, you got
it from the database. Otherwise, how would you know the key value?
If the primary key value is NULL, it is a new record. even if it
contains the same non-key data as an existing record, it is different
anyhow.

That only really works if you don't bother normalizing your data but chuck
in an auto-incrementing record identifier (the rot started with MS-Access
on that one, but its far too common in web apps).

It all rather falls apart when you're dealing with very large sets of
normalized data - what about updating the data within the PK? Cascade
updates?
If you have more than one instance representing one record, you
may look at http://www.w-p.dds.nl/article/wrtabrec.htm for how to fix
that (shamelessly plugging my own article).
I also want to be able to present the data from a QBF or QBE then be able
to step through the result set. However I don't want to have to configure
the DBMS structure - after all most of it is already in the DBMS (OK so
not the relationships in a MySQL db). It'd be really cool if I could
throw SQL directly at it *too*.
I think having to configure a mapper is a bad code smell. My biggest
problem today is the dependency across systems. You probably have
encountered situations where file names were in the database, or where
constants in an application referred to enumerations in a database
field. These dependencies cannot be enforced (no one will stop you from
deleting a file if its name is in a database, for example). If you have
a configuration file for a mapper, you introduce *yet another* non
enforceable dependency, along with an untestable system.

I think that's the point I was making - there should be enough description
of the structure accessible via the DDL to make that redundant. In fact
using just SQL on its own its not sufficient - even with foreign key
declarations and constraints. Having said that there is a lot you can do
with what is avilable even in MySQL. Particularly if you maintain a strict
domain naming strategy.

In my search I came across a couple of systems which seemed to come close to
what I wanted, but both used a static description of the data structure
which was maintained independently of the DBMS - yes that causes problems.
But then, I always write my mappings myself. Common code gets into
superclasses and the strict typing (in languages that support it) and
the communication stategies into the wrappers. So definition/lookup
tables become read-only collections, for instance. For each table, I
decide whether I want it to be lazy, greedy, preloadable (you can
"schedule" a record without loading it yet, but it will be loaded at the
first necessary database action), or whatever other combination of lazy
and greedy.
Always remember that a class should encapsulate and hide its internal
structure. If a table strategy would change the interface of its
wrapping collection, it is not a useful wrapper.

You're the one who started talking about OO. I spend my working hours fixing
problems that have arisen from O-R impedance mismatch. I think the idea of
using a relational database as a persistance layer for an OO application is
fundamentally flawed. While I admit that it does take more work in
producing the initial version of an application, its not really much effort
to extrapolate a normalized database design and implement the factories to
populate the runtime rather than simply a 1:1 object:record mapping - and
the benefits are huge when you start looking at whole lifecycle -
particularly post-release development. This was one of my reasons for
tackling the problem in this way - to eliminate that hurdle. It also means
that the abstraction is applicable in other idioms.

But the article is interesting.

C.

May 12 '06 #8

Toby Inkster

Dikkie Dik wrote:

Quite simple. If the primary key in your application is numeric, you got
it from the database. Otherwise, how would you know the key value?

Because you know it?

For example, if I'm keeping a table of my customers' credit cards, I might
use:

customer_id (int, foreign key)
card_number (big int, primary key)
card_holder_name (varchar)
expiry_date (timestamp)

If I'm inserting a new record into the table then I already know the
primary key, don't I?

--
Toby A Inkster BSc (Hons) ARCS
Contact Me ~ http://tobyinkster.co.uk/contact

May 13 '06 #9

Dikkie Dik

>> Quite simple. If the primary key in your application is numeric, you got

it from the database. Otherwise, how would you know the key value?

Because you know it?

For example, if I'm keeping a table of my customers' credit cards, I might
use:

customer_id (int, foreign key)
card_number (big int, primary key)
card_holder_name (varchar)
expiry_date (timestamp)

If I'm inserting a new record into the table then I already know the
primary key, don't I?

No, you don't. You know a FIELD value of your record, NOT its primary
key. That you wish to combine them (that is always going to get you in
trouble sooner or later) does not change anything. For clarity: A
primary key is only a unique pointer to a record and nothing more.
Especially, primary key values should never be related to the data in
the record.

Let me show it. Suppose this card object is a new one. In this case, you
do NOT know its primary key. You can't, because the primary key value
does not exist yet. Oh, you might know what it WILL be, but I could also
know what it would be if I used an autonumber. The card number column in
you table serves to purposes: it holds both the identity of the record
and some data.
You don't believe me? Just lookup your "known" record in the database.
You won't find it. It does not exist. You will only find it if you know
its primary key value. And for the record to have a primary key value,
it must exist in the database first.

But apart from that, your code does know whether the record comes from
the database or from another source. If you can't trust your primary
keys, you could have the table wrapper just set a boolean when getting
the record from the database. The original question was: how do you know
that a record comes from the database? If you want to know that, your
database code has to be held responsible for keeping track of that. If
you can't do that with the primary key because it interferes with data,
just find another way. Anyhow, your table wrapper knows if it came from
the storage.

May 13 '06 #10

Jerry Stuckle

Dikkie Dik wrote:

Quite simple. If the primary key in your application is numeric, you
got it from the database. Otherwise, how would you know the key value?

Because you know it?

For example, if I'm keeping a table of my customers' credit cards, I
might
use:

customer_id (int, foreign key)
card_number (big int, primary key)
card_holder_name (varchar)
expiry_date (timestamp)

If I'm inserting a new record into the table then I already know the
primary key, don't I?

No, you don't. You know a FIELD value of your record, NOT its primary
key. That you wish to combine them (that is always going to get you in
trouble sooner or later) does not change anything. For clarity: A
primary key is only a unique pointer to a record and nothing more.
Especially, primary key values should never be related to the data in
the record.

Actually, that depends on the primary key. It might not be an autoincrement
column, for instance. I've had it be a equipment id (inventory), part number,
(inventory again), SSN (banking transaction), phone number (non-profit call
list) and many other things. A primary key only identifies a unique row; it
does not have to be an auto-increment column, and in many cases something else
does make more sense.
Let me show it. Suppose this card object is a new one. In this case, you
do NOT know its primary key. You can't, because the primary key value
does not exist yet. Oh, you might know what it WILL be, but I could also
know what it would be if I used an autonumber. The card number column in
you table serves to purposes: it holds both the identity of the record
and some data.
Again, it depends on your primary key. The program logic shouldn't even care
what the primary key is - just field values.
You don't believe me? Just lookup your "known" record in the database.
You won't find it. It does not exist. You will only find it if you know
its primary key value. And for the record to have a primary key value,
it must exist in the database first.

True, it won't exist. But you don't necessarily know that unless you check.
But apart from that, your code does know whether the record comes from
the database or from another source. If you can't trust your primary
keys, you could have the table wrapper just set a boolean when getting
the record from the database. The original question was: how do you know
that a record comes from the database? If you want to know that, your
database code has to be held responsible for keeping track of that. If
you can't do that with the primary key because it interferes with data,
just find another way. Anyhow, your table wrapper knows if it came from
the storage.

Again, the program itself should not know nor should it care about primary keys.

--
==================
Remove the "x" from my email address
Jerry Stuckle
JDS Computer Training Corp.
js*******@attglobal.net
==================

May 14 '06 #11

Dikkie Dik

<snip>

You don't believe me? Just lookup your "known" record in the database.
You won't find it. It does not exist. You will only find it if you
know its primary key value. And for the record to have a primary key
value, it must exist in the database first.

True, it won't exist. But you don't necessarily know that unless you
check.

Wrong. That is what I am trying to tell you all the time. Whether you
should update or insert the record is something you know beforehand. If
the user selects a card from editing, that card is retrieved from the
database and marked (with any means you like) as existing. You should
now be able to change any DATA value of the record.
If the user clicked "enter new card" or just did not select an existing
record, he is entering a new card which is marked as such. So before I
edit anything, I already have checked the existence of the record.

And it's quite simple: existing records get updated, new records get
inserted.

By the way, if you take the credit card number as a primary key AND get
the existence from the database, you would only have to give someone
else's credit card number to hijack his card. If the credit card already
exists for another user in the database, you have a data validity
problem, NOT a key problem (unless you abuse data for a key, off course).
Oh, and if I correct a misspelled credit card in the system, I end up
with TWO credit cards. I told you this is trouble to no ends.

But apart from that, your code does know whether the record comes from
the database or from another source. If you can't trust your primary
keys, you could have the table wrapper just set a boolean when getting
the record from the database. The original question was: how do you
know that a record comes from the database? If you want to know that,
your database code has to be held responsible for keeping track of
that. If you can't do that with the primary key because it interferes
with data, just find another way. Anyhow, your table wrapper knows if
it came from the storage.

Again, the program itself should not know nor should it care about
primary keys.

Say what? The primary key uniquely identifies the record. Without it,
you don't even know which record to update. As shown above, you should
identify a record by its key, NOT by its data. That is why keys should
never relate to data.

May 14 '06 #12

Toby Inkster

Dikkie Dik wrote:

Say what? The primary key uniquely identifies the record. Without it,
you don't even know which record to update. As shown above, you should
identify a record by its key, NOT by its data. That is why keys should
never relate to data.

This is complete rubbish from someone who's clearly learned their database
design principles from _The Muppets' Big Book of RDBMS Programming_.

Yes, when identifying a particular row in a database table, you should use
*a* key. (Not necessarily the primary key though, as in many cases a table
will have alternative candidate keys that may be used. All things being
equal though, might as well use the primary key if we have the choice.)

*But*, the primary key will often relate to data. In the general case
there is no need to introduce an arbitrary additional "pkey int" column to
the table. (Though in certain specific situations it might be necessary,
or simply more convenient.)

http://en.wikipedia.org/wiki/Natural_key
versus
http://en.wikipedia.org/wiki/Surrogate_key

--
Toby A Inkster BSc (Hons) ARCS
Contact Me ~ http://tobyinkster.co.uk/contact

May 14 '06 #13

Toby Inkster

Dikkie Dik wrote:

Toby Inkster wrote:
For example, if I'm keeping a table of my customers' credit cards, I might
use:

customer_id (int, foreign key)
card_number (big int, primary key)
card_holder_name (varchar)
expiry_date (timestamp)

If I'm inserting a new record into the table then I already know the
primary key, don't I?
No, you don't. You know a FIELD value of your record, NOT its primary
key.

Look at the table structure. The credit card number *is* the primary key.
If I know the card details, I know the primary key.
Especially, primary key values should never be related to the data in
the record.

Yes it should.

Any field, or combination thereof, that will never be null and where
values will always be unique in a particular table is a "candidate key".
Any candidate key can be chosen to be the primary key.

In my example above, if I'm never going to need to store the same credit
card number twice, then I can use the card number as a primary key.

--
Toby A Inkster BSc (Hons) ARCS
Contact Me ~ http://tobyinkster.co.uk/contact

May 14 '06 #14

Kenneth Downs

Dikkie Dik wrote:

No, you don't. You know a FIELD value of your record, NOT its primary
key. That you wish to combine them (that is always going to get you in
trouble sooner or later) does not change anything. For clarity: A
primary key is only a unique pointer to a record and nothing more.
Especially, primary key values should never be related to the data in
the record.

A natural primary key is one composed of one or more values that make up
part of the data being recorded. It sounds like our OP is describing a
natural key. Natural keys are a fundamental building block of relational
theory, going right back to Codd, allowing access to data only by the name
of the column values, w/o respect to any implementation-specific pointer
values or record numbers.

Your post describes what we usually call a surrogate key, to distinguish
from a natural primary key. The surrogate is generaly described as a
column that holds no business meaning, but which is unique. They are
usually implemented as sequential integers because those are handy, but
GUID's are sometimes used as well. Relational theorists often confuse
these with pointers and declare them to be evil, but they do not violate
any relational principles.

To make it more confusing, sometimes a natural key can be
computer-generated, such as a sales order #. That looks like a surrogate
to many people but it becomes meaningful outside of the system using it, so
it is really a natural key.

--
Kenneth Downs
Secure Data Software, Inc.
(Ken)nneth@(Sec)ure(Dat)a(.com)

May 14 '06 #15

Kenneth Downs

C. wrote:

Thanks Kenneth - sounds interesting.

I had a quick look....

How do you deal with the problem of updating the primary key of a
record? Cascade update/deletes?

TIA,

C.

Generally speaking I don't. :(

The product right now prohibits changes to PK values.

It is a fairly minor change to enable cascading PK values, but I don't know
when, in the next few weeks, it will actually get done.

Originally we thought that prohibiting changes was a Good Thing, and this
held true in experience until, ironically, we created the online docs as an
Andromeda system. The drafting and redrafting process leads to a desire to
change the name of a page, which is a PK change. So it will get in there
sooner or later in the next few weeks.

It has been on my personal list for a few weeks, and I've just added it to
the online documentation as a requirement for release 1.

--
Kenneth Downs
Secure Data Software, Inc.
(Ken)nneth@(Sec)ure(Dat)a(.com)

May 14 '06 #16

Tim Van Wassenhove

On 2006-05-14, Toby Inkster <us**********@tobyinkster.co.uk> wrote:

Any field, or combination thereof, that will never be null and where
values will always be unique in a particular table is a "candidate key".
Any candidate key can be chosen to be the primary key.

<nitpicking>

If each entity can be uniquely identified via it's properties a and b, then each
entity can also be uniquely identified by it's properties a, b and c.

So i would say: any 'minimal' candidate key can choosen as primary key.
</nitpicking>

--
Met vriendelijke groeten,
Tim Van Wassenhove <http://timvw.madoka.be>

May 14 '06 #17

Kenneth Downs

Colin McKinnon wrote:

In my search I came across a couple of systems which seemed to come close
to what I wanted, but both used a static description of the data structure
which was maintained independently of the DBMS - yes that causes problems.

I would argue from experience that it solves far more problems than it
causes.

If we pull back to the highest architectural view, you have these two big
things called "the db server" and "the application" (wherein the
application may be further divided into web server + browser etc).

The application must know the structure of the tables or it can't do
anything. If you go the non-abstract route, you code directly to the
tables, and you have to change a lot of code when you change tables. If
you go the abstract route and have libraries that perform standard
functions, then they must be provided with a data dictionary that tells
them about the tables.

So if we take it that database abstraction implies a data dictionary, the
question becomes, where is the data dictionary? It can either come out of
the server as a result of scripts that have built the database (read:
INFORMATION_SCHEMA), or it can be outside the server and used to build the
database.

If you locate the data dictionary outside the server, you can in fact
automate every aspect of the software cycle. Database creation, database
updates, deployment to remote servers, application abstraction,
documentation, testing, platform independence and change management all
benefit from the existence of the dictionary, since they all need it.

All of this is database abstraction taken to its logical conclusion.
Abstracting database access in the application is easy enough, there are
countless classes and frameworks out there that seek to do it, but it does
nothing for the entire software cycle. If you wish to abstract the entire
cycle and gain the benefits from soup to nuts and through successive
iterations than you need a dictionary that is outside both the server and
the application, and which is used to generate components for both, and
which governs both.

--
Kenneth Downs
Secure Data Software, Inc.
(Ken)nneth@(Sec)ure(Dat)a(.com)

May 14 '06 #18

Jerry Stuckle

Dikkie Dik wrote:

<snip>
You don't believe me? Just lookup your "known" record in the
database. You won't find it. It does not exist. You will only find it
if you know its primary key value. And for the record to have a
primary key value, it must exist in the database first.

True, it won't exist. But you don't necessarily know that unless you
check.

Wrong. That is what I am trying to tell you all the time. Whether you
should update or insert the record is something you know beforehand. If
the user selects a card from editing, that card is retrieved from the
database and marked (with any means you like) as existing. You should
now be able to change any DATA value of the record.
If the user clicked "enter new card" or just did not select an existing
record, he is entering a new card which is marked as such. So before I
edit anything, I already have checked the existence of the record.

And it's quite simple: existing records get updated, new records get
inserted.

By the way, if you take the credit card number as a primary key AND get
the existence from the database, you would only have to give someone
else's credit card number to hijack his card. If the credit card already
exists for another user in the database, you have a data validity
problem, NOT a key problem (unless you abuse data for a key, off course).
Oh, and if I correct a misspelled credit card in the system, I end up
with TWO credit cards. I told you this is trouble to no ends.

First of all, I didn't say credit card number. But over the last 25+ years,
I've worked on numerous systems - from banking to manufacturing to accounting
and beyond. And I've worked on PC based systems getting 10-20 queries a day,
and mainframe systems getting hundreds of thousands of queries a minute.

And no, an auto-increment field is not always the way to go.

But apart from that, your code does know whether the record comes
from the database or from another source. If you can't trust your
primary keys, you could have the table wrapper just set a boolean
when getting the record from the database. The original question was:
how do you know that a record comes from the database? If you want to
know that, your database code has to be held responsible for keeping
track of that. If you can't do that with the primary key because it
interferes with data, just find another way. Anyhow, your table
wrapper knows if it came from the storage.

Again, the program itself should not know nor should it care about
primary keys.

Say what? The primary key uniquely identifies the record. Without it,
you don't even know which record to update. As shown above, you should
identify a record by its key, NOT by its data. That is why keys should
never relate to data.

Exactly. The primary key is a DATABASE construct. A SSN may very well be the
unique identifier. In many systems such as banking and payroll, it is. An
account number may be the primary key in a number of systems - from banking to
shopping. Airlines use the frequent flier number as the primary key for their
frequent fliers, just as hotels use it for their frequent customers.

It is QUITE frequent to use data as primary keys, when the data itself is unique.
--
==================
Remove the "x" from my email address
Jerry Stuckle
JDS Computer Training Corp.
js*******@attglobal.net
==================

May 14 '06 #19

Colin McKinnon

Thanks to everybody - its welcome to see such informed debate at
c.l.p.

It may be relevant to point out that I used to earn my living developing
RAD applications in MS-Access. It was great for battering out
applications really quickly. It was not nearly so good for collaborative
and large scale development work. What really put me off it - and one of
the main reasons I began developing in PHP is that it is a total PITA
to distribute and maintain applications. So currently I am trying to
seperate what was good in MS-Access from the bad bits and recreate them in
PHP. This is not always as easy as it sounds - avoiding control inversion
in the UI component was difficult - but I think I acheived it sensibly in
PfP Studio. Now I'm trying to do the same with the database.

Kenneth Downs (Secure Data Software, Inc) was the only person brave enough
to suggest a tool to meet my stream-of-consciousness ramblings about
data-abstraction layers in PHP. His library certainly looks viable as a
basis for building my own solution on. As he rightly points out, since the
SQL DDL is inadequate for describing the structure of a relational database
then maybe the right approach is to use a meta-DDL to develop the database
and drive the coding of the PHP.

It was fairly predictable that the issue of primary keys would come up.
Dikkie Dik pointed out that using surrogate keys solves several issues
in that the value is constant after creation (it also makes it simpler
to adapt boilerplate scripts to different applications) however that
has to be balanced against the long term costs caused by undermining
the relational model. I have yet to see any compelling evidence for
using a surrogate key over a natural one - I think Toby Inkster agrees
with me there.

So thanks for your feedback, I have enough material now to get me totally
confused ;) but hopefully I'll hammer out something sensible from it. Of
course if someone beats me to it, I hope I'll hear about here.

C.

May 14 '06 #20

Henk Verhoeven

Colin McKinnon wrote:

Anybody any suggestions?

Hi Colin,

Imho a database abstraction layer is used to make the same application
code work on several databases. It sounds like what you are looking for
is more like an application framework. If you are willing to replace "an
array of data" by "an object holding data", phpPeanuts can take care of
either INSERT or UPDATE the data to MySQL, depending on the result of
$this->isNew(). The code that does it is actually situated on a class
that will serve as a database abstraction layer once a need for the
usage of several databases arises. Of course with objects you get much
more abstraction then just database abstraction, like polymorphism,
navigation over relationships. PhpPeanuts also adds navigational queries
(also available in QBE User Interface). Using SQL directly is possible
in several places, but has the disadvantage that the proper JOIN's will
not be made automatically, like it is done if you use the Query Model.

Greetings,

Henk Verhoeven,
www.phpPeanuts.org.

May 14 '06 #21

Kenneth Downs

Colin McKinnon wrote:

It was fairly predictable that the issue of primary keys would come up.
Dikkie Dik pointed out that using surrogate keys solves several issues
in that the value is constant after creation (it also makes it simpler
to adapt boilerplate scripts to different applications) however that
has to be balanced against the long term costs caused by undermining
the relational model. I have yet to see any compelling evidence for
using a surrogate key over a natural one - I think Toby Inkster agrees
with me there.

I've found in practice that surrogates only get you into trouble if you use
them as foreign keys. Can't offer any high-minded explanation for this, it
just seems to be the case. The practice is to have a natural primary key
that has some business meaning and also a surrogate key. The primary key
is used for foreign keys, the surrogate key is used only to simplify tasks
like updates and deletes after the row has been retrieved.

Every insert command also returns the resulting skey as a NOTICE, so you can
further manipulate the rwo if necessary.

Andromeda adds a surrogate key named "SKEY" for free to every table. This
makes it possible for the framework, once it has retrieved a set of rows,
to have very simple routines based column "SKEY" that do stuff like
page-by-page navigation, row selection, updates, and deletes.

Thanks also for your comments, I hope that you will find some use for
Andromeda.

--
Kenneth Downs
Secure Data Software, Inc.
(Ken)nneth@(Sec)ure(Dat)a(.com)

May 15 '06 #22

Manuel Lemos

Hello,

on 05/14/2006 06:39 PM Colin McKinnon said the following:

It may be relevant to point out that I used to earn my living developing
RAD applications in MS-Access. It was great for battering out
applications really quickly. It was not nearly so good for collaborative
and large scale development work. What really put me off it - and one of
the main reasons I began developing in PHP is that it is a total PITA
to distribute and maintain applications. So currently I am trying to
seperate what was good in MS-Access from the bad bits and recreate them in
PHP. This is not always as easy as it sounds - avoiding control inversion
in the UI component was difficult - but I think I acheived it sensibly in
PfP Studio. Now I'm trying to do the same with the database.

Before you re-invent the wheel, you may want to take a look at Metabase.

It was the first PHP database abstraction layer to focus on total
database application portability.

This means that not only it provides an API that lets you write database
applications that work without a change to access many RDBMS (MySQL,
PostgreSQL, Oracle, MS SQL server, MS Access, SQLite, Interbase, etc..),
but it also lets you install or upgrade database schemas in a RDBMS
independent way.

Metabase can take database schema definitions in a RDBMS independent XML
format. Then it can install the schema executing the necessary DDL
statements for creating, tables, indexes, keys, sequences, etc..

The greatest part, is that when you want to upgrade the schema of an
already installed application, all you need to do is to change your
schema definition files and ask Metabase to compare and apply the
changes. The schema is upgraded without loosing records added to the
database since it was installed for the first time or upgraded for the
last time.

Metabase is available here:

http://www.phpclasses.org/metabase

The tutorial and other documentation is available here:

http://www.meta-language.net/documen....html#metabase

--

Regards,
Manuel Lemos

Metastorage - Data object relational mapping layer generator
http://www.metastorage.net/

PHP Classes - Free ready to use OOP components written in PHP
http://www.phpclasses.org/

May 15 '06 #23

frej

Hi,

I did not read all the messages but i think that DB from
http://pear.php.net would fit your needs?
Take a look at DB_common:autoExecute()

Frej

May 15 '06 #24

Hi Manuel,

It's on my list of things to read next - along with PDOs and
PHP-Creole.

C.

May 15 '06 #25

Hi Henk,

I've had a look at PHPPeanuts and its very impressive. I thought it
would be harder to take it apart to get the bits I want than to tackle
the problem from the ground up. I'll take a look at its inner workings
next time.

C.

May 15 '06 #26

Database abstraction layers

Similar topics