XML-schema 'best practice' question

Hi all

This is not strictly a Python question, but as I am writing in Python,
and as I know there are some XML gurus on this list, I hope it is
appropriate here.

XML-schemas are used to define the structure of an xml document, and
to validate that a particular document conforms to the schema. They
can also be used to transform the document, by filling in missing
attributes with default values.

In my situation, both the creation and the processing of the xml
document are under my control. I know that this begs the question 'why
use xml in the first place', but let's not go there for the moment.

Using minixsv, validating a document with a schema works, but is quite
slow. I appreciate that lxml may be quicker, but I think that my
question is still applicable.

I am thinking of adding a check to see if a document has changed since
it was last validated, and if not, skip the validation step. However,
I then do not get the default values filled in.

I can think of two possible solutions. I just wondered if this is a
common design issue when it comes to xml and schemas, and if there is
a 'best practice' to handle it.

1. Don't use default values - create the document with all values
filled in.

2. Use python to check for missing values and fill in the defaults
when processing the document.
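
A minimal sketch of option 2, assuming the defaults live in a plain Python table rather than in the schema. The DEFAULTS table, the tag names and the attribute names are all invented for illustration; only the standard-library xml.etree.ElementTree calls are real.

```python
import xml.etree.ElementTree as ET

# Hypothetical defaults table: tag name -> {attribute name: default value}.
DEFAULTS = {
    "task": {"priority": "normal", "optional": "false"},
    "flow": {"condition": ""},
}

def fill_defaults(root):
    """Add missing attributes in place, using the DEFAULTS table."""
    for elem in root.iter():
        for name, value in DEFAULTS.get(elem.tag, {}).items():
            # Only set the attribute when the document did not supply it.
            if elem.get(name) is None:
                elem.set(name, value)
    return root

doc = ET.fromstring(
    '<process><task id="t1" priority="high"/><task id="t2"/></process>'
)
fill_defaults(doc)
tasks = doc.findall("task")
```

The obvious cost is that the defaults are then defined in two places (the schema and the Python table) unless the table is generated from the schema.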

Or maybe the best practice is to *always* validate a document before
processing it.

How do experienced practitioners handle this situation?

Thanks for any hints.

Frank Millman
Sep 18 '08 #1
On 18 Sep, 08:28, Frank Millman <fr...@chagford.com> wrote:
> I am thinking of adding a check to see if a document has changed since
> it was last validated, and if not, skip the validation step. However,
> I then do not get the default values filled in.
>
> I can think of two possible solutions. I just wondered if this is a
> common design issue when it comes to xml and schemas, and if there is
> a 'best practice' to handle it.
>
> 1. Don't use default values - create the document with all values
> filled in.
>
> 2. Use python to check for missing values and fill in the defaults
> when processing the document.
>
> Or maybe the best practice is to *always* validate a document before
> processing it.

The stated problem rings a lot of premature optimization bells;
performing the validation and default-filling step every time,
unconditionally, is certainly the least crooked approach.

In case you really want to avoid unnecessary schema processing, if you
are willing to use persistent data to check for changes (for example,
by comparing a hash or the full text of the current document with the
one from the last time you performed validation) you can also store
the filled-in document that you computed, either as XML or as
serialized Python data structures.
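
A rough sketch of that idea, assuming the filled-in document is stored as XML text in a cache directory keyed by a hash of the input. The function name load_validated and the validate_and_fill callable are hypothetical; the latter stands in for whatever validator is actually used and is expected to raise on an invalid document.

```python
import hashlib
import os
import tempfile

def load_validated(xml_text, cache_dir, validate_and_fill):
    """Return the filled-in document, running the slow step only on change.

    validate_and_fill is a hypothetical callable (str -> str) wrapping the
    real validator; it is an assumption, not part of any library.
    """
    digest = hashlib.sha256(xml_text.encode("utf-8")).hexdigest()
    cached_path = os.path.join(cache_dir, digest + ".xml")
    if os.path.exists(cached_path):
        # This exact text was validated before: reuse the stored result.
        with open(cached_path, encoding="utf-8") as f:
            return f.read()
    filled = validate_and_fill(xml_text)  # slow path
    with open(cached_path, "w", encoding="utf-8") as f:
        f.write(filled)
    return filled

# Usage with a stand-in validator that just counts how often it runs.
with tempfile.TemporaryDirectory() as demo_dir:
    calls = []

    def slow_validator(text):
        calls.append(text)
        return text

    load_validated("<doc/>", demo_dir, slow_validator)
    load_validated("<doc/>", demo_dir, slow_validator)
    validator_runs = len(calls)  # the second call is served from the cache
```

If the schema itself can change, its text would have to go into the hash as well, otherwise stale results are returned.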

Regards,
Lorenzo Gatti
Sep 18 '08 #2
Frank> 1. Don't use default values - create the document with all values
Frank> filled in.

Frank> 2. Use python to check for missing values and fill in the defaults
Frank> when processing the document.

Frank> Or maybe the best practice is to *always* validate a document
Frank> before processing it.

Frank> How do experienced practitioners handle this situation?

3. Don't use XML.

(sorry, couldn't resist)

Skip
Sep 18 '08 #3
On Sep 18, 8:28 am, Frank Millman <fr...@chagford.com> wrote:
> Hi all
>
> This is not strictly a Python question, but as I am writing in Python,
> and as I know there are some XML gurus on this list, I hope it is
> appropriate here.
>
> XML-schemas are used to define the structure of an xml document, and
> to validate that a particular document conforms to the schema. They
> can also be used to transform the document, by filling in missing
> attributes with default values.
> [..]
>
> Or maybe the best practice is to *always* validate a document before
> processing it.

I have realised that my question was irrelevant.

xml's raison d'etre is to facilitate the exchange of information
between separate entities. If I want to use xml as a method of
serialisation within my own system, I can do what I like, but there
can be no question of 'best practice' in this situation.

When xml is used as intended, and you want to process a document
received from a third party, there is no doubt that you should always
validate it first before processing it. Thank you, Lorenzo, for
pointing out the obvious. It may take me a while to catch up, but at
least I can see things a little more clearly now.

As to why I am using xml at all, I know that there is a serious side
to Skip's light-hearted comment, so I will try to explain.

I want to introduce an element of workflow management (aka Business
Process Management) into the business/accounting system I am
developing. I used google to try to find out what the current state of
the art is. After several months of very confusing research, this is
the present situation, as best as I can figure it out.

There is an OMG spec called BPMN, for Business Process Modeling
Notation. It provides a graphical notation, intended to be readily
understandable by all business users, from business analysts, to
technical developers, to those responsible for actually managing and
monitoring the processes. Powerful though it is, it does not provide a
standard method of serialising the diagram, so there is no standard way
of exchanging a diagram between different vendors, or of using it as
input to a workflow engine.

There is an OASIS spec called WS-BPEL, for Web Services Business
Process Execution Language. It defines a language for specifying
business process behavior based on Web Services. This does have a
formal xml-based specification. However, it only covers processes
invoked via web services - it does not cover workflow-type processes
within an organisation. To try to fill this gap, a few vendors got
together and submitted a draft specification called BPEL4People. This
proposes a series of extensions to the WS-BPEL spec. It is still at
the evaluation stage.

The BPMN spec includes a section which attempts to provide a mapping
between BPMN and BPEL, but the authors state that there are areas of
incompatibility, so it is not a perfect mapping.

Eventually I would like to make sense of all this, but for now I want
to focus on BPMN, and ignore BPEL. I can use wxPython to design a BPMN
diagram, but I have to invent my own method of serialising it so that
I can use it to drive the business process. For good or ill, I decided
to use xml, as it seems to offer the best chance of keeping up with
the various specifications as they evolve.
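
To make the idea concrete, here is a very small sketch of what such a home-grown serialisation could look like, using only the standard library. The element and attribute names (process, node, flow, kind, and so on) are invented for this example and are not taken from the BPMN or BPEL specifications.

```python
import xml.etree.ElementTree as ET

def diagram_to_xml(nodes, flows):
    """Serialise a diagram to an XML string.

    nodes: list of (id, kind, label) tuples.
    flows: list of (source_id, target_id) tuples.
    All element and attribute names are made up for this sketch.
    """
    root = ET.Element("process")
    for node_id, kind, label in nodes:
        ET.SubElement(root, "node", id=node_id, kind=kind, label=label)
    for source, target in flows:
        ET.SubElement(root, "flow", source=source, target=target)
    return ET.tostring(root, encoding="unicode")

def xml_to_diagram(text):
    """Inverse of diagram_to_xml: rebuild the two lists from the XML."""
    root = ET.fromstring(text)
    nodes = [(n.get("id"), n.get("kind"), n.get("label"))
             for n in root.findall("node")]
    flows = [(f.get("source"), f.get("target"))
             for f in root.findall("flow")]
    return nodes, flows

nodes = [("start", "event", "Order received"), ("check", "task", "Check credit")]
flows = [("start", "check")]
xml_text = diagram_to_xml(nodes, flows)
```

Reading the text back with xml_to_diagram should give the same two lists, which is the property a workflow engine driven from the file would rely on.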

I don't know if this is of any interest to anyone, but it was
therapeutic for me to try to organise my thoughts and get them down on
paper. I am not expecting any comments, but if anyone has any thoughts
to toss in, I will read them with interest.

Thanks

Frank
Sep 20 '08 #4
On 20 Sep, 07:59, Frank Millman <fr...@chagford.com> wrote:
> I want to introduce an element of workflow management (aka Business
> Process Management) into the business/accounting system I am
> developing. I used google to try to find out what the current state of
> the art is. After several months of very confusing research, this is
> the present situation, as best as I can figure it out.

What is the state of the art of existing, working software? Can you
leverage it instead of starting from scratch? For example, the
existing functionality of your accounting software can be reorganized
as a suite of components, web services etc. that can be embedded in
workflow definitions, and/or executing a workflow engine can become a
command in your application.

> There is an OMG spec called BPMN, for Business Process Modeling
> Notation. It provides a graphical notation
> [snip]
> there is no standard way
> of exchanging a diagram between different vendors, or of using it as
> input to a workflow engine.

So BPMN is mere theory. This "spec" might be a reference for
evaluating actual systems, but not a standard itself.

> There is an OASIS spec called WS-BPEL, for Web Services Business
> Process Execution Language. It defines a language for specifying
> business process behavior based on Web Services. This does have a
> formal xml-based specification. However, it only covers processes
> invoked via web services - it does not cover workflow-type processes
> within an organisation. To try to fill this gap, a few vendors got
> together and submitted a draft specification called BPEL4People. This
> proposes a series of extensions to the WS-BPEL spec. It is still at
> the evaluation stage.

Some customers pay good money for buzzword compliance, but are you
sure you want to be so bleeding edge that you care not only for
WS-something specifications, but for "evaluation stage" ones?

There is no need to wait for BPEL4People before designing workflow
systems with human editing, approval, etc.
Try looking into case studies of how BPEL is actually used in
practice.

> The BPMN spec includes a section which attempts to provide a mapping
> between BPMN and BPEL, but the authors state that there are areas of
> incompatibility, so it is not a perfect mapping.

Don't worry, BPMN does not exist: there is no incompatibility.
On the other hand, comparing and understanding BPMN and BPEL might
reveal different purposes and weaknesses between the two systems and
help you distinguish what you need, what would be cool and what is
only a bad idea or a speculation.

> Eventually I would like to make sense of all this, but for now I want
> to focus on BPMN, and ignore BPEL. I can use wxPython to design a BPMN
> diagram, but I have to invent my own method of serialising it so that
> I can use it to drive the business process. For good or ill, I decided
> to use xml, as it seems to offer the best chance of keeping up with
> the various specifications as they evolve.

If you mean to use workflow architectures to add value to your
business and accounting software, your priority should be executing
workflows, not editing workflow diagrams (which are a useful but
unnecessary user interface layer over the actual workflow engine);
making your diagrams and definitions compliant with volatile and
unproven specifications should come a distant last.

> I don't know if this is of any interest to anyone, but it was
> therapeutic for me to try to organise my thoughts and get them down on
> paper. I am not expecting any comments, but if anyone has any thoughts
> to toss in, I will read them with interest.

1) There are a number of open-source or affordable workflow engines,
mostly BPEL-compliant and written in Java; they should be more useful
than reinventing the wheel.

2) With a good XML editor you can produce the workflow definitions,
BPEL or otherwise, that your workflow engine needs, and leave the
interactive diagram editor for a phase 2 that might not necessarily
come; text editing might be convenient enough for your users, and for
graphical output something simpler than an editor (e.g. a Graphviz
exporter) might be enough.

3) Maybe workflow processing can grow inside your existing accounting
application without the sort of "big bang" redesign you seem to be
planning; chances are that the needed objects are already in place and
you only need to make workflow more explicit and add appropriate new
features.
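
As an illustration of point 2, a Graphviz exporter can be only a few lines long. This sketch assumes a made-up workflow format with <node id=... label=...> and <flow source=... target=...> elements under a single root; it only produces DOT text, and rendering is left to the Graphviz tools.

```python
import xml.etree.ElementTree as ET

def workflow_to_dot(xml_text):
    """Turn a workflow definition into Graphviz DOT source.

    The node/flow element names are assumptions made for this sketch,
    not part of BPEL or any other specification.
    """
    root = ET.fromstring(xml_text)
    lines = ["digraph workflow {"]
    for node in root.findall("node"):
        node_id = node.get("id")
        label = node.get("label", node_id)  # fall back to the id
        lines.append('  "%s" [label="%s"];' % (node_id, label))
    for flow in root.findall("flow"):
        lines.append('  "%s" -> "%s";' % (flow.get("source"), flow.get("target")))
    lines.append("}")
    return "\n".join(lines)

sample = (
    '<process>'
    '<node id="a" label="Enter invoice"/>'
    '<node id="b" label="Approve"/>'
    '<flow source="a" target="b"/>'
    '</process>'
)
dot_source = workflow_to_dot(sample)
```

The resulting text can be saved to a file and passed to the dot command. Labels containing double quotes would need escaping, which this sketch does not do.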

Regards,
Lorenzo Gatti
Sep 20 '08 #5
Sorry for pressing the send button too fast.

Regards,
Lorenzo Gatti
Sep 20 '08 #6
