
Whitespace in Canonicalized XML

If I understand correctly, canonicalized XML is a simplified, or
rather, "standardized" form of XML. It is a form such that
two documents that are written in different ways, but contain the same
information, will normalize to one form. This standard form can
then be used as the basis for encryption or digital verification (such
as XML Digital Signature).

If this is the case, then why is whitespace outside of any tags still
preserved? (See Example 3.2 of the W3C Canonical XML Recommendation)

Isn't that whitespace only useful for formatting purposes (i.e., so that
it will look pretty on your text viewer)? Or am I missing something
important?

Thank you for your reply...
Jul 20 '05 #1
"Celedor" <Ce*****@tekken.cc> wrote...
> If this is the case, then why is whitespace outside of any tags still
> preserved? (See Example 3.2 of the W3C Canonical XML Recommendation)
> Isn't that whitespace only useful for formatting purposes (i.e., so that
> it will look pretty on your text viewer)? Or am I missing something
> important?


Anything that affects how the image will appear is obviously part of
the information.
Jul 20 '05 #2

"Celedor" <Ce*****@tekken.cc> wrote in message
news:4e*************************@posting.google.com...
> If I understand correctly, canonicalized XML is a simplified, or
> rather, "standardized" form of XML. It is a form such that
> two documents that are written in different ways, but contain the same
> information, will normalize to one form. This standard form can
> then be used as the basis for encryption or digital verification (such
> as XML Digital Signature).
>
> If this is the case, then why is whitespace outside of any tags still
> preserved? (See Example 3.2 of the W3C Canonical XML Recommendation)

Hi,

The characteristics and properties of a "presentation" depend very much
on who or what the intended recipient is. In the case of XML, by design,
humans are not the only possible recipients. XML is also intended to convey
data to machines, and these machines should be capable of processing XML
without any ambiguity messing up the works. To accomplish this, XML
defines a very simple rule: anything in "tags" is XML markup, and
everything else is data.

If you look at the XML spec, you can see that there are different XML
node types defined. One of them is the text node. Consider the example
below:

<a>This is a text node
<ThisIsAnElementNode x="this is an attribute node">This is also a text
node</ThisIsAnElementNode></a>

This is perfectly valid XML. There are no assumptions that you can make
in general about the content of the text nodes. They may be completely
whitespace, or not, and only the receiving application / entity can tell you
if the whitespace is significant. When writing a spec, obviously, the
general case is what needs to be catered to, and hence, pure whitespace text
nodes cannot be "normalized" away.
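
To see this in practice, here is a minimal Python sketch. It assumes
Python 3.8 or later, whose standard library provides
xml.etree.ElementTree.canonicalize(). Note that this function implements
Canonical XML 2.0 rather than the 1.0 Recommendation discussed in this
thread, so treat it as an illustration of the principle only. The sample
documents and variable names are made up.

```python
from xml.etree.ElementTree import canonicalize

# Three spellings of (arguably) the same document. All are assumptions
# invented for this illustration.
compact = '<doc><item id="1">x</item></doc>'
spaced_tags = "<doc ><item   id = '1' >x</item ></doc >"  # whitespace inside tags
indented = '<doc>\n  <item id="1">x</item>\n</doc>'       # whitespace between tags

c_compact = canonicalize(compact)
c_spaced = canonicalize(spaced_tags)
c_indented = canonicalize(indented)

# Whitespace inside tags is markup, so canonicalization removes the difference.
print(c_compact == c_spaced)

# Whitespace between tags is character data (text nodes), so it is kept.
print(c_compact == c_indented)
print(repr(c_indented))

# Dropping it is an explicit, application-level choice, not the default.
print(canonicalize(indented, strip_text=True) == c_compact)
```

The strip_text option is off by default, which matches the point above:
the canonicalizer cannot know whether the whitespace matters, so the
application has to opt in to discarding it.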

That being said, the "xml:space" attribute exists to signal how
whitespace should be treated. Its value is either "default" or "preserve",
and it is only a hint: when the XML / higher-level application processor
(for example, an XSL processor) encounters xml:space, it may or may not
normalize pure whitespace nodes - it depends on the application.
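
As a rough sketch of what such an application-level decision could look
like, the Python snippet below drops whitespace-only text unless
xml:space="preserve" is in scope. The helper name strip_ignorable and the
sample document are hypothetical, and this is just one possible policy,
not something the XML or Canonical XML specs mandate.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:space under the predefined XML namespace.
XML_SPACE = "{http://www.w3.org/XML/1998/namespace}space"


def strip_ignorable(elem, preserve=False):
    """Hypothetical helper: remove whitespace-only text inside elem,
    except where xml:space="preserve" applies (the setting is inherited)."""
    space = elem.get(XML_SPACE)
    if space is not None:
        preserve = space == "preserve"
    if not preserve:
        # elem.text is the text before the first child; each child's tail
        # is the text after that child, still inside elem.
        if elem.text is not None and not elem.text.strip():
            elem.text = None
        for child in elem:
            if child.tail is not None and not child.tail.strip():
                child.tail = None
    for child in elem:
        strip_ignorable(child, preserve)


# Made-up sample: <a> and <b> both contain a single space.
doc = ET.fromstring('<doc>\n  <a> </a>\n  <b xml:space="preserve"> </b>\n</doc>')
strip_ignorable(doc)

print(repr(doc.find("a").text))  # whitespace dropped by this policy
print(repr(doc.find("b").text))  # whitespace kept because of xml:space
```

A different application could just as legitimately keep everything,
which is why a general-purpose canonicalizer leaves the text alone.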

Regards,
Kenneth
Jul 20 '05 #3
Celedor wrote:
> If I understand correctly, canonicalized XML is a simplified, or
> rather, "standardized" form of XML. It is a form such that
> two documents that are written in different ways, but contain the same
> information, will normalize to one form. This standard form can
> then be used as the basis for encryption or digital verification (such
> as XML Digital Signature).
>
> If this is the case, then why is whitespace outside of any tags still
> preserved? (See Example 3.2 of the W3C Canonical XML Recommendation)
>
> Isn't that whitespace only useful for formatting purposes (i.e., so that
> it will look pretty on your text viewer)? Or am I missing something
> important?


Only if you have a DTD or Schema that tells you where PCDATA is allowed.

Without one, you have to assume character data can occur anywhere, which
makes *all* white-space significant.

///Peter

Jul 20 '05 #4
