
XPath data model and the XML Declaration

Hi

Components of an XML file are mapped to the different node types of the
XPath data model.
An element is mapped to an element node, an attribute node represents an
attribute and its value ... and so on.

so far so good

But what happens to the XML declaration (<?xml version="1.0" ......?>)?
Though it looks like a processing instruction, it isn't.

The XPath specification [1] only says that the XML declaration does not map
to a processing instruction node (section 5.5).

Does the XML declaration map to an XPath node type or not?
If yes, what type of node?

thx

Helmut
[1] XPath Specification http://www.w3c.org/TR/xpath
Jul 20 '05 #1
4 Replies



Does the XML Declaration map to a Xpath node type or not.
If yes - what type of node?

No, it affects the parser (e.g. telling it what encoding has been used),
and so affects the construction of the node tree, but no record of the
declaration is left in the node tree itself.

That's (normally) what you want. If you have a file in latin1 that
starts
<?xml version="1.0" encoding="iso-8859-1"?>
...

and the same data encoded in utf-8 that starts

<?xml version="1.0" encoding="utf-8"?>

Then you want these two to be equivalent. The actual data in the nodes
are just logical Unicode characters; the encoding originally used to store
them in a file shouldn't be relevant to the logical view of the
XML tree (and isn't available to XPath/XSLT).
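
To make the equivalence concrete, here is a minimal sketch in Python using the standard library's xml.etree.ElementTree (an assumption for illustration; the thread itself names no particular parser). The element name "greeting" and its content are made up. The two byte strings differ on disk, but the parsed text is the same sequence of Unicode characters.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample content containing one non-ASCII character (e-acute).
body = "<greeting>caf\u00e9</greeting>"

# Same logical document, serialized in two different encodings.
latin1_bytes = ('<?xml version="1.0" encoding="iso-8859-1"?>' + body).encode("iso-8859-1")
utf8_bytes = ('<?xml version="1.0" encoding="utf-8"?>' + body).encode("utf-8")

root_latin1 = ET.fromstring(latin1_bytes)
root_utf8 = ET.fromstring(utf8_bytes)

# The raw bytes differ, but the parsed character data does not.
bytes_differ = latin1_bytes != utf8_bytes
text_equal = root_latin1.text == root_utf8.text
print(bytes_differ, text_equal)
```

Neither root object carries any trace of which encoding declaration was used; the parser consumed it while decoding.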

Of course in many cases it's not unreasonable to want to output a file
with the same encoding as was used for input, but that information has
gone, along with information about whether " or ' was used for
attributes, or where CDATA sections were, etc.

David
Jul 20 '05 #2

* Helmut Dirtinger wrote in comp.text.xml:
Does the XML Declaration map to a Xpath node type or not.


No, it does not. The XML declaration does not contain information that
would make sense to be available in the data model.
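
A rough way to see this, assuming Python's xml.dom.minidom as a stand-in (it builds a DOM tree, not the XPath data model, but the two agree on this point): a real processing instruction shows up as a child of the document, while the XML declaration does not. The target name "my-pi" is invented for the example.

```python
from xml.dom import minidom

# A document with an XML declaration, one real processing instruction,
# and a root element. "my-pi" is a made-up PI target.
source = b'<?xml version="1.0" encoding="utf-8"?><?my-pi some data?><root/>'
doc = minidom.parseString(source)

# Node type constants of the document's top-level children.
child_types = [node.nodeType for node in doc.childNodes]

# Targets of every processing instruction among those children.
pi_targets = [
    node.target
    for node in doc.childNodes
    if node.nodeType == node.PROCESSING_INSTRUCTION_NODE
]

print(len(doc.childNodes), pi_targets)
```

Only two children are reported (the processing instruction and the root element), and the only PI target is "my-pi"; nothing with the target "xml" appears.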
--
Björn Höhrmann · mailto:bj****@hoehrmann.de · http://bjoern.hoehrmann.de
Weinh. Str. 22 · Telefon: +49(0)621/4309674 · http://www.bjoernsworld.de
68309 Mannheim · PGP Pub. KeyID: 0xA4357E78 · http://www.websitedev.de/
Jul 20 '05 #3

In article <41****************@news.bjoern.hoehrmann.de>,
Bjoern Hoehrmann <bj****@hoehrmann.de> wrote:
Does the XML Declaration map to a Xpath node type or not.
No, it does not.


True.
The XML declaration does not contain information that
would make sense to be available in the data model.


But this is an exaggeration. The encoding is a useful hint when
serializing the data model, and the XML version number affects what
assumptions you can make about the data (e.g. whether it might have
C0 control characters in it).

-- Richard

Jul 20 '05 #4

* Richard Tobin wrote in comp.text.xml:
The XML declaration does not contain information that
would make sense to be available in the data model.


But this is an exaggeration. The encoding is a useful hint when
serializing the data model, and the XML version number affects what
assumptions you can make about the data (e.g. whether it might have
C0 control characters in it).


The problem, however, is that the value of the encoding pseudo-attribute
might not be the actual character encoding of the document. For example,
if the document is delivered via HTTP, the actual encoding might be
specified in the HTTP header, which would then override the value in the
XML declaration (if any). So it would make more sense to use other means
to get that information, for example the DOM Level 3 Core attributes
such as Document.inputEncoding.

Regarding C0 control characters, I do not think these may appear in the
XPath 1.0 data model, which is only defined for XML 1.0. (They are not
allowed to appear in expressions either; strings are restricted to chars
as defined in XML 1.0, which excludes C0 controls other than tab, line
feed and carriage return.) And if they are in
the data model, you can infer that the XML version is 1.1, so this is of
limited use, too.
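
As a small check of the XML 1.0 side of this, here is a sketch using Python's xml.etree.ElementTree, which is backed by an XML 1.0 parser (expat). The helper name parses is made up for the example. A character reference to tab is accepted, while a reference to U+0001 is rejected as not well-formed.

```python
import xml.etree.ElementTree as ET

def parses(xml_text):
    """Return True if the text is accepted by the XML 1.0 parser."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

# &#9; is tab, one of the few C0 controls XML 1.0 allows.
tab_ok = parses('<?xml version="1.0"?><a>&#9;</a>')

# &#1; is U+0001, which is not a legal character in XML 1.0.
soh_ok = parses('<?xml version="1.0"?><a>&#1;</a>')

print(tab_ok, soh_ok)
```

This only exercises XML 1.0; it says nothing about how an XML 1.1 processor would expose such characters.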

Other uses would be possible too, for example, if you want to write an
XSLT document that discovers meta-data from XML documents like

XML Declaration: yes
Encoding Declaration: ISO-8859-1
Standalone: no

Elements:

1 <html>
1 <head>
1 <body>
32 <div>
...

or whatever. So maybe I should rephrase: the value such functionality
would add is not considered worth the additional complication, and it
might encourage false assumptions, such as that
the encoding in the XML declaration is the actual document encoding.
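
For what it's worth, a report like the one above can be produced outside XPath/XSLT. The following is a minimal sketch with Python's xml.dom.minidom, which is an assumption for illustration and not something the thread proposes. It relies on the version, encoding and standalone attributes that CPython's minidom sets on the Document object; these echo what the declaration says and are not the actual transport encoding, which is exactly the caveat raised above. The function name describe and the sample document are hypothetical.

```python
from collections import Counter
from xml.dom import minidom

def describe(xml_bytes):
    """Build a small metadata report for an XML document given as bytes."""
    doc = minidom.parseString(xml_bytes)
    standalone = {True: "yes", False: "no", None: "not declared"}[doc.standalone]
    lines = [
        # version stays None when the document has no XML declaration
        "XML Declaration: " + ("yes" if doc.version is not None else "no"),
        # the *declared* encoding, which may differ from the real one
        "Encoding Declaration: " + str(doc.encoding),
        "Standalone: " + standalone,
        "",
        "Elements:",
        "",
    ]
    # Count every element in the document by tag name, in document order.
    counts = Counter(el.tagName for el in doc.getElementsByTagName("*"))
    for name, count in counts.items():
        lines.append(f"{count:3d} <{name}>")
    return "\n".join(lines)

sample = (b'<?xml version="1.0" encoding="ISO-8859-1" standalone="no"?>'
          b'<html><head/><body><div/><div/></body></html>')
report = describe(sample)
print(report)
```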
--
Björn Höhrmann · mailto:bj****@hoehrmann.de · http://bjoern.hoehrmann.de
Weinh. Str. 22 · Telefon: +49(0)621/4309674 · http://www.bjoernsworld.de
68309 Mannheim · PGP Pub. KeyID: 0xA4357E78 · http://www.websitedev.de/
Jul 20 '05 #5
