XML and Ontologies

I am interested in XML mediation and the use of ontologies to link
similar but different element names in XML schema. Am I correct in my
understanding that an ontology is a language or set of commands that
is agreed upon, thus making mediation between XML element names
unnecessary? Also, is this the best method of mediation between XML
files?
thanks for any help
Alex
Jul 20 '05 #1
On 10 Jul 2003 05:35:27 -0700, al**************@hotmail.com (Alex
Fawcett) wrote:
> I am interested in XML mediation and the use of ontologies to link
> similar but different element names in XML schema.
XML is a bit of an unhappy fit with ontologies - you start to
appreciate the differences between RDF and XML.

I suggest giving Protégé a whirl
http://protege.stanford.edu

It's an environment for editing both ontologies and instance data, in
a very approachable style. Certainly worth a look.

I spent much of last week here:
http://protege.stanford.edu/workshop_vi/schedule.html
and blogged a brief trip report here:
http://www.livejournal.com/users/quercus/20830.html
Protégé takes a frames-based approach, rather than a description logics
(DL) approach. This makes a big difference, but you need to get a little
hands-on with both (and frames is perhaps simpler to start with). We
don't know where we'll end up finally, and we might need to combine
both approaches.

Take a look at the W3C's OWL (Web Ontology Language) and the older
work (SHOE, OIL, DAML+OIL) too. These are generally DL-based (take a
look at Manchester's OilEd, if you want a contrast to Protégé).

> Am I correct in my
> understanding that an ontology is a language or set of commands that
> is agreed upon, thus making mediation between XML element names
> unnecessary?
What's an ontology ? I've written the "30 second elevator pitch" on
this about a dozen times over the last few years. It's very hard to
give one simple definition that meets all needs. Everyone who comes to
ontologies (and it's almost a stampede now) approaches from a
different angle.

Natasha Noy's classic paper "Ontology 101" is a good place to start:
http://protege.stanford.edu/publicat...cguinness.html

Broadly, I'd say that an ontology is a single definition of a set of
entities and their related properties, expressed in a form that other
systems can understand.

It may also describe their metaphysical "meanings", which is the
difference between an ontology and a schema (or between DAML and
DAML+OIL)

An ontology does not describe mappings or mediation between two XML
schemas. Depending on your meaning of "mediation" this might be easy
(if you know they're ontologically identical, but you just need to
match up the names), but mapping is generally speaking a fiendishly
difficult problem.
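
The easy case mentioned above, where the two schemas are known to mean the same thing and only the names differ, can be sketched in a few lines. This is only an illustration: the element names and the `NAME_MAP` table are hypothetical, and it assumes both schemas share exactly the same structure.

```python
import xml.etree.ElementTree as ET

# Hypothetical source-to-target element names. Both schemas are assumed to
# have the same structure and the same meaning; only the names differ.
NAME_MAP = {"client": "customer", "surname": "last-name"}


def rename_elements(xml_text, name_map):
    """Return the document with every mapped element name replaced."""
    root = ET.fromstring(xml_text)
    for element in root.iter():  # iter() visits the root as well
        element.tag = name_map.get(element.tag, element.tag)
    return ET.tostring(root, encoding="unicode")


source = "<client><surname>Fawcett</surname></client>"
print(rename_elements(source, NAME_MAP))
```

Anything beyond a one-to-one rename (different nesting, split or merged fields) falls outside what a lookup table like this can express.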

You can approach it with ontologies. You use two ontologies,
describing both the source and target. Then you apply some form of
complex reasoning to identify commonality and as much "mapping" as is
possible. From this you then generate (or auto-generate) code to do
the mapping. Easy.

The problem is that any ontology beyond the trivial has no simple
mapping between entities. Does an employee have a "works-for"
relationship with their boss, or a "works-in" with their department
and a "manages" relationship between boss and department ? This stuff
just doesn't overlay cleanly, so an improved description technique
alone isn't going to fix things.
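
A small sketch of why the employee example does not reduce to renaming. The people, departments and function name here are made up for illustration; the point is only that one model's "works-for" has to be derived by joining two relationships from the other model.

```python
# Model A stores the boss directly (hypothetical data).
works_for_a = {("alice", "carol"), ("bob", "carol")}

# Model B stores department membership and department management instead.
works_in = {("alice", "sales"), ("bob", "sales"), ("carol", "sales")}
manages = {("carol", "sales")}


def derive_works_for(works_in, manages):
    """Join works-in with manages to recover person -> boss pairs."""
    return {
        (person, boss)
        for person, department in works_in
        for boss, managed in manages
        if department == managed and person != boss
    }


print(derive_works_for(works_in, manages) == works_for_a)
```

Going the other way is worse: model A never mentions departments, so model B cannot be rebuilt from it without outside knowledge.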
> Also, is this the best method of mediation between XML
> files?


Depends on the scale of your problem. What's an "XML file" ? Are
these the same two schemas you see every day, or is it a dynamic
problem with every new message ? How different are the two models ?

Incidentally, the same problem between one XML document and an RDBMS
is also common.

There's a lot of very rudimentary work being passed off around this
problem (Oracle 9i being a case in point) where people in suits with a
product to sell are pushing very simple (often XSLT-based) solutions
as a panacea. Those who are seriously in the field know it's not so
easy.

There's also the problem of meta-languages. Many people are already
encountering this with database output, and it has a huge effect on
the use of XSLT.

Consider an RDBMS with a generic XML export filter. What should the
output look like ?

<order>
  <order-item>
    <a>1</a><b>2</b>
  </order-item>
  <order-item>
    <b>3</b><c>4</c>
  </order-item>
</order>

<query name="order">
  <row name="order-item">
    <column name="a">1</column><column name="b">2</column>
  </row>
  <row name="order-item">
    <column name="b">3</column><column name="c">4</column>
  </row>
</query>

The first of these maps column names onto element names. It generates
compact XML that's probably how most XML coders would do it manually.
The trouble is that it's a new DTD for every query.

The second is a meta-format. The DTD is the same for every query
output and only the name="" metadata changes. It's verbose (but we
don't care, because our computers deal with that for us)

Ontologically these are _identical_ (they ought to be, or our export
filter is broken). In terms of ease of use though, they're quite
different. The first is unstable and somewhat unpredictable
(although you can easily auto-export a DTD or even ontology at the
same time), the second is hard to process (with XSLT).
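
To make the "ontologically identical" claim concrete, here is a sketch that reads both export styles into the same plain Python structure. The function names are hypothetical and the readers handle only the two sample documents from this post, not a general export format.

```python
import xml.etree.ElementTree as ET

TYPE_1 = """<order>
  <order-item><a>1</a><b>2</b></order-item>
  <order-item><b>3</b><c>4</c></order-item>
</order>"""

TYPE_2 = """<query name="order">
  <row name="order-item">
    <column name="a">1</column><column name="b">2</column>
  </row>
  <row name="order-item">
    <column name="b">3</column><column name="c">4</column>
  </row>
</query>"""


def rows_from_type_1(xml_text):
    """Column names are the element names."""
    root = ET.fromstring(xml_text)
    return root.tag, [{col.tag: col.text for col in row} for row in root]


def rows_from_type_2(xml_text):
    """Column names live in the name="" attribute."""
    root = ET.fromstring(xml_text)
    rows = [
        {col.get("name"): col.text for col in row.findall("column")}
        for row in root.findall("row")
    ]
    return root.get("name"), rows


print(rows_from_type_1(TYPE_1) == rows_from_type_2(TYPE_2))
```

Both readers arrive at the same query name and the same list of rows; only the route through the markup differs.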

XSLT is a language for transformations of XML data at the structural
level. This works fine for our "type 1" data above, or for much XML,
because XML's data model is inferred from the structure (go read
XML-Infoset). A structural transformation _is_ a transformation at the
level of the data-model.

The second one becomes much harder. We've now separated the structural
level (and the data model of our consistent "generic export format")
from the data model of our "real" data. An XSLT transform still
operates at the structural level (it has to - that's what XSLT does)
and so it's now divorced from the level the interesting data is
residing at. Using XSLT to make real "data-level" transformations
like this becomes a real PITA. In some formats it's straightforward,
but long-winded, in others (like RDF) it becomes well-nigh impossible.
Schematron can sometimes help.
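
The same gap shows up in path expressions, which is where XSLT templates do their matching. The sketch below uses the limited XPath subset in Python's `xml.etree.ElementTree` rather than XSLT itself, with the two sample documents from this post cut down to fit; it selects every "b" value from each.

```python
import xml.etree.ElementTree as ET

type_1 = ET.fromstring(
    "<order>"
    "<order-item><a>1</a><b>2</b></order-item>"
    "<order-item><b>3</b><c>4</c></order-item>"
    "</order>"
)
type_2 = ET.fromstring(
    '<query name="order">'
    '<row name="order-item">'
    '<column name="a">1</column><column name="b">2</column></row>'
    '<row name="order-item">'
    '<column name="b">3</column><column name="c">4</column></row>'
    "</query>"
)

# Type 1: the path names the data directly.
b_values_1 = [e.text for e in type_1.findall("order-item/b")]

# Type 2: the path names the container format, and the real names have to
# be tested as attribute values at every step.
b_values_2 = [
    e.text
    for e in type_2.findall("row[@name='order-item']/column[@name='b']")
]

print(b_values_1 == b_values_2)
```

Every selector against the second format carries that extra layer of attribute tests, which is the long-winded part; formats with more than one legal way to write the same data make it worse.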

RDF is a bit like "type 2" data, with a "generic export format" that's
already defined by the RDF/XML standards. You can't work with
non-trivial RDF in XSLT, because of just this problem. That's why RDF
is manipulated by tools such as Jena, that work at the data model
level.
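
A rough illustration of the RDF point, using only the Python standard library rather than Jena or any real RDF toolkit. RDF/XML allows the same statement to be written in several structurally different ways; two of them are shown here. The `triples` reader is a deliberately partial, hypothetical helper that understands only these two forms, and the `example.org` names are invented.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# The property written as a child element.
FORM_A = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:ex="http://example.org/terms#">
  <rdf:Description rdf:about="http://example.org/item1">
    <ex:name>Widget</ex:name>
  </rdf:Description>
</rdf:RDF>"""

# The same property written as an attribute.
FORM_B = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:ex="http://example.org/terms#">
  <rdf:Description rdf:about="http://example.org/item1" ex:name="Widget"/>
</rdf:RDF>"""


def triples(xml_text):
    """Partial reader: literal properties on rdf:Description only."""
    found = set()
    root = ET.fromstring(xml_text)
    for node in root.findall(f"{{{RDF}}}Description"):
        subject = node.get(f"{{{RDF}}}about")
        for key, value in node.attrib.items():
            if not key.startswith(f"{{{RDF}}}"):  # skip rdf:about itself
                found.add((subject, key, value))
        for child in node:
            found.add((subject, child.tag, (child.text or "").strip()))
    return found


print(triples(FORM_A) == triples(FORM_B))
```

A structural template written against one form misses the other, while code working on the set of triples does not care which form was used.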
Jul 20 '05 #2
In message <ek********************************@4ax.com>, Andy Dingley
<di*****@codesmiths.com> writes
On 10 Jul 2003 05:35:27 -0700, al**************@hotmail.com (Alex
Fawcett) wrote:
> I am interested in XML mediation and the use of ontologies to link
> similar but different element names in XML schema.
>
> Am I correct in my
> understanding that an ontology is a language or set of commands that
> is agreed upon, thus making mediation between XML element names
> unnecessary?


Further to Andy's excellent thoughts on this issue, I would add the
suggestion that you could look into using Topic Maps
(http://www.topicmaps.org/) to represent equivalences between concepts
in schemas. As it happens, I was doing exactly this only last week, as
preparation for a data mapping exercise.

I took the two schemas I wanted to compare, and used XSLT to convert
them to Topic Maps. I then wrote a "links" document containing
relationships between individual concepts. As it happens, I wrote this
in the sort of "compact" style Andy described, e.g.:

<link type="exact">
<member schema="nt" id="condition-check"/>
<member schema="spectrum" id="check"/>
</link>

but I could easily use XSLT to convert this to a proper Topic Map
(containing nothing but Associations).

What I actually did was to convert this "links" document into an HTML
table of links between equivalent concepts in the two schemas. This was
sufficient for the task at hand.
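
For readers without an XSLT setup, the links-to-table step can be approximated as below. This is a Python sketch, not the XSLT described in this post; the wrapping `<links>` root element and the table layout are assumptions, since only a single `<link>` is shown above.

```python
import xml.etree.ElementTree as ET
from html import escape

# The <links> wrapper is assumed; the <link> element is the one quoted above.
LINKS = """<links>
  <link type="exact">
    <member schema="nt" id="condition-check"/>
    <member schema="spectrum" id="check"/>
  </link>
</links>"""


def links_to_html(xml_text):
    """Render one table row per link: its type, then each member."""
    root = ET.fromstring(xml_text)
    rows = []
    for link in root.findall("link"):
        cells = [escape(link.get("type", ""))]
        for member in link.findall("member"):
            cells.append(escape(f"{member.get('schema')}:{member.get('id')}"))
        rows.append("<tr>" + "".join(f"<td>{c}</td>" for c in cells) + "</tr>")
    return "<table>\n" + "\n".join(rows) + "\n</table>"


print(links_to_html(LINKS))
```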

In principle I could instead have made my "links" document into a TM in
its own right, and then used it to merge the two schemas into a single
TM with all the correspondences expressed as TM Associations. This sort
of approach lets you work at a higher level of abstraction than the raw
XML (i.e. at a "Topic Map concepts" level). Conversely, TM XML is
pretty simple (if verbose) in its structure, so you may get more mileage
using XSLT than Andy suggests you would with RDF.

Richard Light
--
Richard Light
SGML/XML and Museum Information Consultancy
ri*****@light.demon.co.uk

Jul 20 '05 #3
