Robert Brown (ro*************@yahoo.com) writes:
Our customer (of our ecommerce system) wants to be able to preserve
deleted entities in the database so that they can do reporting,
auditing etc.
The system is quite complex where each end user can belong to multiple
institutional affiliations (which can purchase on behalf of the user).
The end user also has a rich trail of past transactions affiliations
etc. Thus in the schema each user entity is related to many others
which in turn relate to yet others and so on.
In the past when a user was deleted all of his complex relationships
were also deleted in a cascading fashion. But now the customer wants
us to add a "deleted" flag to each user so that a user is never
_really_ deleted but instead his "deleted" flag is set to true. The
system subsequently behaves as if the user did not exist but the
customer can still do reports on deleted users.
I pointed out that it is not as simple as that because the user entity
is related to many, many others so we would have to add this "deleted"
flag to every relationship and every other entity and thus have
"deleted" past purchases, "deleted" affiliations - a whole shadow
schema full of such ghost entities. This would overtime degrade
performance since now each query in the system has to add a clause:
"where deleted = 0".
Indeed, this is quite a big change if this was not in the system from
the beginning.
In our system, many items are not deletable, because even if an item
goes away, there might still be plenty of references to it. So in these
tables - accounts, customers, instruments, currencies, to name a few - there
is a deregdate column. This column serves the double purpose of telling
us if the item is still active, and if it is not, when the entity was
deregistered. And, yes it happens that deregistered entities are revived
too!
A non-trivial issue here is to know when a deregistered item should be
included and when it should not. If you are producing a list of last
month's tranactions, it obviously should. But if you are populating a
list of available products to order, deregistered products should not be
included. So when you introduce this concept in your system, you have
a lot to write in your functional specification too.
Another issue which becomes complicated, is referential integrity.
accounts.accresponsible may refer to the users table, but if the account
is active, the user must be too. Currently we do this in triggers, which
is a bit complex, and more difficult than foreign-key constraints. One
thought I've been playing with is to have tables like active_accounts,
active_users etc. This would not be the partition suggested by others,
since active_accounts would only hold the account number, and foreign
keys to other deregisterable items. Thus, active_accounts.accresponsible
would refer to active_users.userid. The full data would still be in
acconts and users, for both active and deregistered items.
I have never considered the performance cost for "AND deregdate IS NULL",
but I would suggest that if you need to access that column, you probably
already access some column which is not in any index, so there is
already a bookmark lookup, so I would not expect any particular penalty.
(This applies to MS SQL Server. Not that I really expect Oracle to be
different, but I don't know Oracle.)
--
Erland Sommarskog, SQL Server MVP,
so****@algonet.se
Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techinf...2000/books.asp