Stefan Ram wrote:
"Andy Dingley" <di*****@codesmiths.com> writes:
Aspects like keys are quite specific to a specific implementation
through an RDBMS and it's not necessarily important to preserve them.
Keys are inherent to the relational model of the /data/ -
they are not an implementation detail of a specific RDBMS.
The concept of "keys" is relevant to any current "relational"
implementation of the data.
However the _specific_ use of keys is specific to the implementation.
There's a question of how far you normalise your data when designing a
relational model for it. You don't have to normalise to the same form
each time, and you don't have to use identical key structures.
If we see this "XML output" of the database model as being application
centric, then we don't care about such design choices. No matter how
normalised the data was when it was stored internally, we want the same
denormalised view for the output. As different implementations may have
used a different data model (an Access implementation was probably
de-normalised compared to a SQL Server implementation), this difference
is now irrelevant, inappropriate and possibly misleading. Our XML
representation shouldn't preserve these keys.
I only know about them from the XML-application "XHTML 1.1",
where the id-attribute has ID-type and "for" and "usemap" have
IDREF-type, and was not aware of problems with this approach.
Read the RDF documentation. Much of RDF's work was in overcoming the
shortcomings of XML, in providing a usable data model for ID &
IDREF-like concepts.
XML has two major shortcomings here:
* To use IDREF, you must first have an ID. What happens if you want to
refer to a node that's identifiable, but not explicitly labelled? It's
a valid requirement.
* ID & IDREF only work within a single document. To make small
applications that can inter-work in a large universe, we need tools
that can refer outside their immediate frame of reference. XPointer is
an attempt here, but there's still a lot lacking with XML in this
context.
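
To make the two shortcomings concrete, here is a minimal Python sketch of IDREF-style lookup. It assumes the XHTML convention mentioned earlier ("id" as the ID, "for" as the IDREF); the sample document and the resolve_idrefs helper are made up for illustration, and ElementTree does not read DTD attribute types, so the typing is by convention only.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; "id" plays the ID role, "for" the IDREF role.
DOC = """
<form>
  <label for="email">E-mail</label>
  <input id="email" type="text"/>
  <input type="submit"/>
</form>
"""

def resolve_idrefs(root, id_attr="id", ref_attr="for"):
    """Return (referrer, target) pairs for every IDREF-style attribute."""
    # Only elements with an explicit ID get indexed. The submit button
    # has none, so nothing can ever point at it this way.
    by_id = {
        el.get(id_attr): el
        for el in root.iter()
        if el.get(id_attr) is not None
    }
    # The index covers this one document only; a reference into another
    # document has nothing to resolve against.
    return [
        (el, by_id.get(el.get(ref_attr)))
        for el in root.iter()
        if el.get(ref_attr) is not None
    ]

pairs = resolve_idrefs(ET.fromstring(DOC))
```

The second input is perfectly identifiable ("the input of type submit"), yet it never enters the index because nobody labelled it, and the index itself stops at the document boundary.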
But it's hardly practical, is it?
It's not so hard storing relational data in XML with one
element per set and one element per tuple, in fact, this
seems quite natural to me.
What's a "tuple" here? A tuple as held in a table, or a tuple as a
row in a relational view? I have no real interest in
tuples-from-tables, they're too low level and only really useful for
"database replication between databases with identical table structures
and data models".
If we look at the more interesting case of tuples from a view, then
these will be de-normalised (i.e. they have structure that would have
been normalised into multiple tables). An appropriate XML
representation of these is also normalised. Now we can still say "one
element per tuple" simply, but it has to become "one parent element for
one or more tuples" and "potentially more than one level of element
hierarchy within a tuple".
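
As a rough illustration of "one parent element for one or more tuples", the following Python sketch nests the rows of a de-normalised view into an element hierarchy. The view columns (customer, order id, product), the element names and the rows_to_xml helper are all invented for the example, and the rows are assumed to arrive sorted by the grouping columns.

```python
from itertools import groupby
from operator import itemgetter
import xml.etree.ElementTree as ET

# Hypothetical rows from a de-normalised view: (customer, order_id, product).
# They must already be sorted by customer, then order_id.
ROWS = [
    ("Alice", 1, "widget"),
    ("Alice", 1, "gadget"),
    ("Alice", 2, "sprocket"),
    ("Bob", 3, "widget"),
]

def rows_to_xml(rows):
    """Build one <customer> per distinct customer, one <order> per order."""
    root = ET.Element("customers")
    for name, cust_rows in groupby(rows, key=itemgetter(0)):
        # One parent element stands for every tuple sharing this customer.
        cust = ET.SubElement(root, "customer", name=name)
        for order_id, order_rows in groupby(cust_rows, key=itemgetter(1)):
            # Second level of hierarchy inside what was a flat tuple.
            order = ET.SubElement(cust, "order", id=str(order_id))
            for _, _, product in order_rows:
                ET.SubElement(order, "item").text = product
    return root

tree = rows_to_xml(ROWS)
print(ET.tostring(tree, encoding="unicode"))
```

Four tuples come in, but only two customer elements and three order elements come out; the repeated column values have been folded into the hierarchy.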
I strongly recommend studying MS SQL 2000 and the splendid hack with
which they implemented the "FOR XML" select query, without changing
anything in the database itself. If you search the MSDN SDK for
"Universal table" then there's a good explanation of it. Basically any
"FOR XML" query produces a huge denormalised scratch table called the
"Universal table", then a trivial row scanner runs through this and
generates new element hierarchies when column values change. Quite
useful, and a splendid low-effort hack.
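
The row-scanner idea can be sketched in a few lines of Python. This is not SQL Server's actual universal table format, only a simplified stand-in: it assumes a sorted scratch table with one column per nesting level, and the column names, sample data and scan_rows helper are hypothetical.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import quoteattr

# Hypothetical stand-in for a sorted, de-normalised scratch table:
# (department, employee, skill)
ROWS = [
    ("sales", "ann", "negotiation"),
    ("sales", "ann", "french"),
    ("sales", "bob", "excel"),
    ("it", "cat", "sql"),
]
LEVELS = ["department", "employee", "skill"]  # outermost first

def scan_rows(rows, levels, root="result"):
    """Emit nested elements, reopening a level whenever its column changes."""
    out = [f"<{root}>"]
    prev = None
    depth = len(levels)
    for row in rows:
        if prev is None:
            first = 0  # very first row: open every level
        else:
            # Outermost column whose value changed. An identical row
            # still starts a new innermost element.
            first = next(
                (i for i in range(depth) if row[i] != prev[i]), depth - 1
            )
            # Close the open elements from the innermost up to that level.
            for i in range(depth - 1, first - 1, -1):
                out.append(f"</{levels[i]}>")
        # Open fresh elements from the changed level inwards.
        for i in range(first, depth):
            out.append(f"<{levels[i]} name={quoteattr(str(row[i]))}>")
        prev = row
    if prev is not None:
        for i in range(depth - 1, -1, -1):
            out.append(f"</{levels[i]}>")
    out.append(f"</{root}>")
    return "\n".join(out)

xml_text = scan_rows(ROWS, LEVELS)
doc = ET.fromstring(xml_text)  # parses, so the output is well-formed
```

The scanner only ever remembers the previous row, which is why it can run as a single pass over the scratch table without touching the underlying tables.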