Hello,
It is often convenient to insert whitespace into an XML document in order to
format it nicely. For example, take this snippet of a notional DocBook XML
document:

    <para>
    This is a longer paragraph.
    With <wordasword>longer</wordasword> I mean that it contains more than
    one sentence.
    </para>

I want the whitespace in this snippet of code to be handled as follows:
(1) The whitespace between "<para>" and "This" as well as the whitespace
between "sentence." and "</para>" shall be discarded.
(2) Each other sequence of adjacent whitespace characters shall be
transformed into a single space character.
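To make the intended handling concrete, here is a minimal sketch of rules (1)
and (2) in Python. It assumes the standard xml.etree.ElementTree module; the
helper names collapse and normalize_para are made up for illustration, and
"whitespace" is taken to mean space, tab, CR and LF. This only shows the
result I am after, not what any particular XML processor does by default.

```python
import re
import xml.etree.ElementTree as ET

# Whitespace as XML defines it: space, tab, carriage return, line feed.
WS = re.compile(r"[ \t\r\n]+")


def collapse(text):
    """Rule (2): turn each run of whitespace into a single space."""
    return WS.sub(" ", text) if text else text


def normalize_para(elem):
    """Apply rules (1) and (2) to one paragraph-like element (hypothetical helper)."""
    # Rule (2): collapse whitespace in all character data inside elem.
    elem.text = collapse(elem.text)
    for child in elem.iter():
        if child is not elem:
            child.text = collapse(child.text)
            child.tail = collapse(child.tail)

    # Rule (1): drop whitespace directly after the start tag ...
    if elem.text:
        elem.text = elem.text.lstrip(" ")
    # ... and directly before the end tag.
    children = list(elem)
    if children:
        if children[-1].tail:
            children[-1].tail = children[-1].tail.rstrip(" ")
    elif elem.text:
        elem.text = elem.text.rstrip(" ")
    return elem


snippet = (
    "<para>\n"
    "This is a longer paragraph.\n"
    "With <wordasword>longer</wordasword> I mean that it contains more than\n"
    "one sentence.\n"
    "</para>"
)
result = ET.tostring(normalize_para(ET.fromstring(snippet)), encoding="unicode")
print(result)
```

With the snippet above, the paragraph comes out on a single line, with one
space between "paragraph." and "With" and no whitespace next to the para tags.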
But how do XML processors and applications deal with this issue?
In section 2.10 of "Extensible Markup Language (XML) 1.0 (Third Edition)",
one can read:

    In editing XML documents, it is often convenient to use "white
    space" (spaces, tabs, and blank lines) to set apart the markup for
    greater readability. Such white space is typically not intended for
    inclusion in the delivered version of the document.

But who decides which whitespace counts as whitespace that is merely used to
set apart the markup? Is whitespace that merely indents lines of text
likewise not intended for inclusion in the delivered version? And what is
this "delivered version" of the document?
I'd be thankful for any clarification.
Best wishes,
Wolfgang