
XML-schema 'best practice' question

Hi all

This is not strictly a Python question, but as I am writing in Python,
and as I know there are some XML gurus on this list, I hope it is
appropriate here.

XML-schemas are used to define the structure of an xml document, and
to validate that a particular document conforms to the schema. They
can also be used to transform the document, by filling in missing
attributes with default values.

In my situation, both the creation and the processing of the xml
document are under my control. I know that this begs the question 'why
use xml in the first place', but let's not go there for the moment.

Using minixsv, validating a document with a schema works, but is quite
slow. I appreciate that lxml may be quicker, but I think that my
question is still applicable.

I am thinking of adding a check to see if a document has changed since
it was last validated, and if not, skip the validation step. However,
I then do not get the default values filled in.

I can think of two possible solutions. I just wondered if this is a
common design issue when it comes to xml and schemas, and if there is
a 'best practice' to handle it.

1. Don't use default values - create the document with all values
filled in.

2. Use python to check for missing values and fill in the defaults
when processing the document.
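
A minimal sketch of option 2, assuming the defaults are kept in a plain Python dictionary rather than read from the schema. The `DEFAULTS` table, the `task` element and its attribute names are made up for illustration; only the standard library's `xml.etree.ElementTree` is used.

```python
import xml.etree.ElementTree as ET

# Hypothetical defaults table: element tag -> {attribute name: default value}.
# In a real system these would have to mirror the defaults declared in the schema.
DEFAULTS = {
    "task": {"priority": "normal", "retries": "0"},
}


def fill_defaults(root):
    """Add every missing attribute listed in DEFAULTS; never overwrite existing ones."""
    for elem in root.iter():
        for name, value in DEFAULTS.get(elem.tag, {}).items():
            if name not in elem.attrib:
                elem.set(name, value)
    return root


doc = ET.fromstring('<process><task id="a" priority="high"/><task id="b"/></process>')
fill_defaults(doc)
for task in doc.iter("task"):
    print(task.attrib)
```

The obvious drawback is that the defaults now live in two places, the schema and the Python table, and have to be kept in step by hand.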

Or maybe the best practice is to *always* validate a document before
processing it.

How do experienced practitioners handle this situation?

Thanks for any hints.

Frank Millman
Sep 18 '08 #1
On 18 Sep, 08:28, Frank Millman <fr...@chagford.com> wrote:
> I am thinking of adding a check to see if a document has changed since
> it was last validated, and if not, skip the validation step. However,
> I then do not get the default values filled in.
>
> I can think of two possible solutions. I just wondered if this is a
> common design issue when it comes to xml and schemas, and if there is
> a 'best practice' to handle it.
>
> 1. Don't use default values - create the document with all values
> filled in.
>
> 2. Use python to check for missing values and fill in the defaults
> when processing the document.
>
> Or maybe the best practice is to *always* validate a document before
> processing it.

The stated problem rings a lot of premature optimization bells;
performing the validation and default-filling step every time,
unconditionally, is certainly the least crooked approach.

In case you really want to avoid unnecessary schema processing, if you
are willing to use persistent data to check for changes (for example,
by comparing a hash or the full text of the current document with the
one from the last time you performed validation) you can also store
the filled-in document that you computed, either as XML or as
serialized Python data structures.
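
As a rough sketch of that idea, assuming the slow step can be wrapped in a single function: `load_validated` and the `validate_and_fill` callable are hypothetical names, the latter standing in for whatever validation and default-filling code is actually used (minixsv, lxml or otherwise). The hash and the cache use only the standard library.

```python
import hashlib
import pickle
from pathlib import Path


def load_validated(xml_path, cache_path, validate_and_fill):
    """Return the processed document, re-validating only when the XML has changed.

    validate_and_fill is a placeholder callable (bytes -> picklable object)
    standing in for the real schema validation and default-filling step.
    """
    xml_bytes = Path(xml_path).read_bytes()
    digest = hashlib.sha256(xml_bytes).hexdigest()

    cache_file = Path(cache_path)
    if cache_file.exists():
        with cache_file.open("rb") as f:
            cached_digest, cached_result = pickle.load(f)
        if cached_digest == digest:
            return cached_result  # document unchanged: skip the slow step

    # First run, or the document changed: do the full work and remember it.
    result = validate_and_fill(xml_bytes)
    with cache_file.open("wb") as f:
        pickle.dump((digest, result), f)
    return result
```

Pickle is only reasonable here because both files are under your own control; a cache that could come from anyone else should be stored as XML or another plain data format instead.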

Regards,
Lorenzo Gatti
Sep 18 '08 #2
Frank> 1. Don't use default values - create the document with all values
Frank> filled in.

Frank> 2. Use python to check for missing values and fill in the defaults
Frank> when processing the document.

Frank> Or maybe the best practice is to *always* validate a document
Frank> before processing it.

Frank> How do experienced practitioners handle this situation?

3. Don't use XML.

(sorry, couldn't resist)

Skip
Sep 18 '08 #3
On Sep 18, 8:28 am, Frank Millman <fr...@chagford.com> wrote:
> Hi all
>
> This is not strictly a Python question, but as I am writing in Python,
> and as I know there are some XML gurus on this list, I hope it is
> appropriate here.
>
> XML-schemas are used to define the structure of an xml document, and
> to validate that a particular document conforms to the schema. They
> can also be used to transform the document, by filling in missing
> attributes with default values.
> [..]
>
> Or maybe the best practice is to *always* validate a document before
> processing it.

I have realised that my question was irrelevant.

xml's raison d'etre is to facilitate the exchange of information
between separate entities. If I want to use xml as a method of
serialisation within my own system, I can do what I like, but there
can be no question of 'best practice' in this situation.

When xml is used as intended, and you want to process a document
received from a third party, there is no doubt that you should always
validate it first before processing it. Thank you, Lorenzo, for
pointing out the obvious. It may take me a while to catch up, but at
least I can see things a little more clearly now.

As to why I am using xml at all, I know that there is a serious side
to Skip's light-hearted comment, so I will try to explain.

I want to introduce an element of workflow management (aka Business
Process Management) into the business/accounting system I am
developing. I used google to try to find out what the current state of
the art is. After several months of very confusing research, this is
the present situation, as best as I can figure it out.

There is an OMG spec called BPMN, for Business Process Modeling
Notation. It provides a graphical notation, intended to be readily
understandable by all business users, from business analysts, to
technical developers, to those responsible for actually managing and
monitoring the processes. Powerful though it is, it does not provide a
standard method of serialising the diagram, so there is no standard way
of exchanging a diagram between different vendors, or of using it as
input to a workflow engine.

There is an OASIS spec called WS-BPEL, for Web Services Business
Process Execution Language. It defines a language for specifying
business process behavior based on Web Services. This does have a
formal xml-based specification. However, it only covers processes
invoked via web services - it does not cover workflow-type processes
within an organisation. To try to fill this gap, a few vendors got
together and submitted a draft specification called BPEL4People. This
proposes a series of extensions to the WS-BPEL spec. It is still at
the evaluation stage.

The BPMN spec includes a section which attempts to provide a mapping
between BPMN and BPEL, but the authors state that there are areas of
incompatibility, so it is not a perfect mapping.

Eventually I would like to make sense of all this, but for now I want
to focus on BPMN, and ignore BPEL. I can use wxPython to design a BPMN
diagram, but I have to invent my own method of serialising it so that
I can use it to drive the business process. For good or ill, I decided
to use xml, as it seems to offer the best chance of keeping up with
the various specifications as they evolve.

I don't know if this is of any interest to anyone, but it was
therapeutic for me to try to organise my thoughts and get them down on
paper. I am not expecting any comments, but if anyone has any thoughts
to toss in, I will read them with interest.

Thanks

Frank
Sep 20 '08 #4
On 20 Sep, 07:59, Frank Millman <fr...@chagford.com> wrote:
> I want to introduce an element of workflow management (aka Business
> Process Management) into the business/accounting system I am
> developing. I used google to try to find out what the current state of
> the art is. After several months of very confusing research, this is
> the present situation, as best as I can figure it out.

What is the state of the art of existing, working software? Can you
leverage it instead of starting from scratch? For example, the
existing functionality of your accounting software can be reorganized
as a suite of components, web services etc. that can be embedded in
workflow definitions, and/or executing a workflow engine can become a
command in your application.

> There is an OMG spec called BPMN, for Business Process Modeling
> Notation. It provides a graphical notation
> [snip]
> there is no standard way
> of exchanging a diagram between different vendors, or of using it as
> input to a workflow engine.

So BPMN is mere theory. This "spec" might be a reference for
evaluating actual systems, but not a standard itself.

> There is an OASIS spec called WS-BPEL, for Web Services Business
> Process Execution Language. It defines a language for specifying
> business process behavior based on Web Services. This does have a
> formal xml-based specification. However, it only covers processes
> invoked via web services - it does not cover workflow-type processes
> within an organisation. To try to fill this gap, a few vendors got
> together and submitted a draft specification called BPEL4People. This
> proposes a series of extensions to the WS-BPEL spec. It is still at
> the evaluation stage.

Some customers pay good money for buzzword compliance, but are you
sure you want to be so bleeding edge that you care not only for WS-
something specifications, but for "evaluation stage" ones?

There is no need to wait for BPEL4People before designing workflow
systems with human editing, approval, etc.
Try looking into case studies of how BPEL is actually used in
practice.

> The BPMN spec includes a section which attempts to provide a mapping
> between BPMN and BPEL, but the authors state that there are areas of
> incompatibility, so it is not a perfect mapping.

Don't worry, BPMN does not exist: there is no incompatibility.
On the other hand, comparing and understanding BPMN and BPEL might
reveal different purposes and weaknesses between the two systems and
help you distinguish what you need, what would be cool and what is
only a bad idea or a speculation.

> Eventually I would like to make sense of all this, but for now I want
> to focus on BPMN, and ignore BPEL. I can use wxPython to design a BPMN
> diagram, but I have to invent my own method of serialising it so that
> I can use it to drive the business process. For good or ill, I decided
> to use xml, as it seems to offer the best chance of keeping up with
> the various specifications as they evolve.

If you mean to use workflow architectures to add value to your
business and accounting software, your priority should be executing
workflows, not editing workflow diagrams (which are a useful but
unnecessary user interface layer over the actual workflow engine);
making your diagrams and definitions compliant with volatile and
unproven specifications should come a distant last.

> I don't know if this is of any interest to anyone, but it was
> therapeutic for me to try to organise my thoughts and get them down on
> paper. I am not expecting any comments, but if anyone has any thoughts
> to toss in, I will read them with interest.

1) There are a number of open-source or affordable workflow engines,
mostly BPEL-compliant and written in Java; they should be more useful
than reinventing the wheel.

2) With a good XML editor you can produce the workflow definitions,
BPEL or otherwise, that your workflow engine needs, and leave the
interactive diagram editor for a phase 2 that might not necessarily
come; text editing might be convenient enough for your users, and for
graphical output something simpler than an editor (e.g. a Graphviz
exporter) might be enough.

3) Maybe workflow processing can grow inside your existing accounting
application without the sort of "big bang" redesign you seem to be
planning; chances are that the needed objects are already in place and
you only need to make workflow more explicit and add appropriate new
features.
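
To give an idea of how small the Graphviz exporter from point 2 can be, here is a sketch that turns a workflow definition into DOT text. The XML format (`<workflow>`, `<step id label>`, `<flow from to>`) is invented purely for illustration and is neither BPEL nor BPMN; `workflow_to_dot` is likewise a hypothetical name. Labels containing double quotes are not escaped.

```python
import xml.etree.ElementTree as ET


def workflow_to_dot(xml_text):
    """Convert a workflow definition to Graphviz DOT text.

    Assumes a made-up format: <step id=".." label=".."/> elements for the
    nodes and <flow from=".." to=".."/> elements for the edges.
    """
    root = ET.fromstring(xml_text)
    lines = ["digraph workflow {"]
    for step in root.iter("step"):
        # Fall back to the id when a step has no label.
        label = step.get("label", step.get("id"))
        lines.append('    "%s" [label="%s"];' % (step.get("id"), label))
    for flow in root.iter("flow"):
        lines.append('    "%s" -> "%s";' % (flow.get("from"), flow.get("to")))
    lines.append("}")
    return "\n".join(lines)


example = """
<workflow>
  <step id="enter" label="Enter invoice"/>
  <step id="approve" label="Approve invoice"/>
  <flow from="enter" to="approve"/>
</workflow>
"""
print(workflow_to_dot(example))
```

The printed text can be saved to a file and rendered with the `dot` command, for example `dot -Tpng workflow.dot -o workflow.png`.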

Regards,
Lorenzo Gatti
Sep 20 '08 #5
Sorry for pressing the send button too fast.

Regards,
Lorenzo Gatti
Sep 20 '08 #6
