
XML and namespaces

P: n/a
I'm having some issues around namespace handling with XML:
>>> document = xml.dom.minidom.Document()
>>> element = document.createElementNS("DAV:", "href")
>>> document.appendChild(element)
<DOM Element: href at 0x1443e68>
>>> document.toxml()
'<?xml version="1.0" ?>\n<href/>'

Note that the namespace wasn't emitted. If I have PyXML,
xml.dom.ext.Print does emit the namespace:
>>> xml.dom.ext.Print(document)
<?xml version='1.0' encoding='UTF-8'?><href xmlns='DAV:'/>

Is that a limitation in toxml(), or is there an option to make it
include namespaces?

-wsv

Nov 22 '05 #1
36 Replies


P: n/a
Wilfredo Sánchez Vega:
"""
I'm having some issues around namespace handling with XML:
>>> document = xml.dom.minidom.Document()
>>> element = document.createElementNS("DAV:", "href")
>>> document.appendChild(element)
<DOM Element: href at 0x1443e68>
>>> document.toxml()
'<?xml version="1.0" ?>\n<href/>'
"""

I haven't worked with minidom in just about forever, but from what I
can tell this is a serious bug (or at least an appalling missing
feature). I can't find anything in the Element.writexml() method that
deals with namespaces. But I'm just baffled. Is there really any way
such a bug could have gone so long unnoticed in Python and PyXML? I
searched both trackers, and the closest thing I could find was this
from 2002:

http://sourceforge.net/tracker/index...73&atid=106473

Different symptom, but also looks like a case of namespace ignorant
code.

Can anyone who's worked on minidom more recently let me know if I'm
just blind to something?

--
Uche Ogbuji Fourthought, Inc.
http://uche.ogbuji.net http://fourthought.com
http://copia.ogbuji.net http://4Suite.org
Articles: http://uche.ogbuji.net/tech/publications/

Nov 30 '05 #2

P: n/a
I've found the same bug. This is what I've been doing:

from xml.dom.minidom import Document

try:
    from xml.dom.ext import PrettyPrint
except ImportError:
    PrettyPrint = None

doc = Document()
...

if PrettyPrint is not None:
    PrettyPrint(doc, stream=output, indent=' ')
else:
    top_parent.setAttribute("xmlns", xmlns)
    output.write(doc.toprettyxml(indent=' '))

Nov 30 '05 #3

P: n/a
On 30 Nov 2005 07:22:56 -0800,
uc*********@gmail.com <uc*********@gmail.com> quoted:
>>> element = document.createElementNS("DAV:", "href")


This call is incorrect; the signature is createElementNS(namespaceURI,
qualifiedName). If you call .createElementNS('whatever', 'DAV:href'),
the output is the expected:
<?xml version="1.0" ?><DAV:href/>

It doesn't look like there's any code in minidom that will
automatically create an 'xmlns:DAV="whatever"' attribute for you. Is
this automatic creation an expected behaviour?

(I assume not. Section 1.3.3 of the DOM Level 3 says "Similarly,
creating a node with a namespace prefix and namespace URI, or changing
the namespace prefix of a node, does not result in any addition,
removal, or modification of any special attributes for declaring the
appropriate XML namespaces." So the DOM can create XML documents that
aren't well-formed w.r.t. namespaces, I think.)
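
A minimal sketch of the behaviour described here, assuming a current Python 3 and the standard-library xml.dom.minidom (the thread itself predates Python 3). It only inspects the node properties and the serialized string; the URI "whatever" is the placeholder used above.

```python
from xml.dom import minidom

doc = minidom.Document()
# Signature: createElementNS(namespaceURI, qualifiedName)
element = doc.createElementNS("whatever", "DAV:href")
doc.appendChild(element)

# The namespace information is stored in node properties...
print(element.namespaceURI, element.prefix, element.localName)

# ...but toxml() writes only the qualified name and any attributes
# that already exist on the element.
serialized = doc.toxml()
print(serialized)
```

If minidom behaves as reported in this thread, the printed document carries the DAV: prefix but no xmlns:DAV attribute.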

--amk
Nov 30 '05 #4

P: n/a
Quoting Andrew Kuchling:
"""
>>> element = document.createElementNS("DAV:", "href")
This call is incorrect; the signature is createElementNS(namespaceURI,
qualifiedName).
"""

Not at all, Andrew. "href" is a valid qname, as is "foo:href". The
prefix is optional in a QName. Here is the correct behavior, taken
from a non-broken DOM library (4Suite's Domlette)
>>> from Ft.Xml import Domlette
>>> document = Domlette.implementation.createDocument(None, None, None)
>>> element = document.createElementNS("DAV:", "href")
>>> document.appendChild(element)
<Element at 0xb7d12e2c: name u'href', 0 attributes, 0 children>
>>> Domlette.Print(document)
<?xml version="1.0" encoding="UTF-8"?>
<href xmlns="DAV:"/>
>>>

"""
If you call .createElementNS('whatever', 'DAV:href'),
the output is the expected:
<?xml version="1.0" ?><DAV:href/>
"""

Oh, no. That is not at all expected. The output should be:

<?xml version="1.0" ?><DAV:href xmlns:DAV="whatever"/>

"""
It doesn't look like there's any code in minidom that will
automatically create an 'xmlns:DAV="whatever"' attribute for you. Is
this automatic creation an expected behaviour?
"""

Of course. Minidom implements level 2 (thus the "NS" at the end of the
method name), which means that its APIs should all be namespace aware.
The bug is that writexml() and thus toxml() are not so.

"""
(I assume not. Section 1.3.3 of the DOM Level 3 says "Similarly,
creating a node with a namespace prefix and namespace URI, or changing
the namespace prefix of a node, does not result in any addition,
removal, or modification of any special attributes for declaring the
appropriate XML namespaces." So the DOM can create XML documents that
aren't well-formed w.r.t. namespaces, I think.)
"""

Oh no. That only means that namespace declaration attributes are not
created in the DOM data structure. However, output has to fix up
namespaces in .namespaceURI properties as well as directly asserted
"xmlns" attributes. It would be silly for DOM to produce malformed
XML+XMLNS, and of course it is not meant to. The minidom behavior
needs fixing, badly.

--
Uche Ogbuji Fourthought, Inc.
http://uche.ogbuji.net http://fourthought.com
http://copia.ogbuji.net http://4Suite.org
Articles: http://uche.ogbuji.net/tech/publications/

Dec 2 '05 #5

P: n/a
On 2 Dec 2005 06:16:29 -0800,
uc*********@gmail.com <uc*********@gmail.com> wrote:
Of course. Minidom implements level 2 (thus the "NS" at the end of the
method name), which means that its APIs should all be namespace aware.
The bug is that writexml() and thus toxml() are not so.


Hm, OK. Filed as bug #1371937 in the Python bug tracker. Maybe I'll
look at this during the bug day this Sunday.

--amk

Dec 2 '05 #6

P: n/a
[AMK]
"""
(I assume not. Section 1.3.3 of the DOM Level 3 says "Similarly,
creating a node with a namespace prefix and namespace URI, or changing
the namespace prefix of a node, does not result in any addition,
removal, or modification of any special attributes for declaring the
appropriate XML namespaces." So the DOM can create XML documents that
aren't well-formed w.r.t. namespaces, I think.)
"""
[Uche] Oh no. That only means that namespace declaration attributes are not
created in the DOM data structure. However, output has to fix up
namespaces in .namespaceURI properties as well as directly asserted
"xmlns" attributes. It would be silly for DOM to produce malformed
XML+XMLNS, and of course it is not meant to. The minidom behavior
needs fixing, badly.


My interpretation of namespace nodes is that the application is
responsible for creating whatever namespace declaration attribute nodes
are required, on the DOM tree.

DOM should not have to imply any attributes on output.

#-=-=-=-=-=-=-=-=-=
import xml.dom
import xml.dom.minidom

DAV_NS_U = "http://webdav.org"

xmldoc = xml.dom.minidom.Document()
xmlroot = xmldoc.createElementNS(DAV_NS_U, "DAV:xpg")
xmlroot.setAttributeNS(xml.dom.XMLNS_NAMESPACE, "xmlns:DAV", DAV_NS_U)
xmldoc.appendChild(xmlroot)
print xmldoc.toprettyxml()
#-=-=-=-=-=-=-=-=-=

produces

"""
<?xml version="1.0" ?>
<DAV:xpg xmlns:DAV="http://webdav.org"/>
"""

Which is well formed wrt namespaces.
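
One way to check that claim is to feed the serialized text back to a namespace-aware parser. The sketch below assumes Python 3 and minidom's default expat-based parser; the two input strings are the output above and the same element without its declaration.

```python
from xml.dom import minidom
from xml.parsers.expat import ExpatError

DAV_NS_U = "http://webdav.org"

declared = '<DAV:xpg xmlns:DAV="http://webdav.org"/>'
undeclared = '<DAV:xpg/>'

# With the declaration present, the parser recovers the namespace URI.
root = minidom.parseString(declared).documentElement
print(root.namespaceURI)

# Without it, the prefix is unbound and the parse is expected to fail.
try:
    minidom.parseString(undeclared)
    undeclared_ok = True
except ExpatError as exc:
    undeclared_ok = False
    print("rejected:", exc)
```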

regards,

--
alan kennedy
------------------------------------------------------
email alan: http://xhaus.com/contact/alan
Dec 2 '05 #7

P: n/a
Alan Kennedy:
"""
Oh no. That only means that namespace declaration attributes are not
created in the DOM data structure. However, output has to fix up
namespaces in .namespaceURI properties as well as directly asserted
"xmlns" attributes. It would be silly for DOM to produce malformed
XML+XMLNS, and of course it is not meant to. The minidom behavior
needs fixing, badly.


My interpretation of namespace nodes is that the application is
responsible for creating whatever namespace declaration attribute nodes
are required, on the DOM tree.

DOM should not have to imply any attributes on output.
"""

I'm sorry but you're wrong on this. First of all, DOM L2 (the level
minidom targets) does not have the concept of "namespace nodes".
That's XPath. DOM supports two ways of expressing namespace
information. The first way is through the node properties
.namespaceURI, .prefix (for the QName) and .localName. It *also*
supports literal namespace declaration attributes (the NSDecl
attributes themselves must have a namespace of
"http://www.w3.org/2000/xmlns/"). As if this is not confusing enough,
the Level 1 property .nodeName must provide the QName, redundantly.

As a result, you have to perform fix-up to merge properties with
explicit NSDecl attributes in order to serialize. If the serializer does not do
so, it is losing all the information in namespace properties, and the
resulting output is not the same document that is represented in the
DOM.
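
The kind of fix-up described here could be sketched roughly as follows. This is a deliberately simplified illustration, not minidom's API and not the DOM Level 3 normalization algorithm: the helper name declare_namespaces is hypothetical, it handles element names only (not namespaced attributes), and it assumes Python 3.

```python
import xml.dom
from xml.dom import minidom

def declare_namespaces(node, in_scope=None):
    """Hypothetical helper: add xmlns attributes that element names need."""
    # Copy the scope so declarations do not leak between siblings.
    in_scope = dict(in_scope or {})  # prefix (None = default) -> URI
    if node.nodeType == node.ELEMENT_NODE:
        # Explicit declarations already on the element join the scope.
        for attr in list(node.attributes.values()):
            if attr.namespaceURI == xml.dom.XMLNS_NAMESPACE:
                prefix = None if attr.nodeName == "xmlns" else attr.localName
                in_scope[prefix] = attr.value or None
        # Declare the element's own namespace if the scope disagrees.
        if in_scope.get(node.prefix) != node.namespaceURI:
            name = "xmlns:" + node.prefix if node.prefix else "xmlns"
            node.setAttributeNS(xml.dom.XMLNS_NAMESPACE, name,
                                node.namespaceURI or "")
            in_scope[node.prefix] = node.namespaceURI
    for child in node.childNodes:
        declare_namespaces(child, in_scope)

doc = minidom.Document()
href = doc.createElementNS("DAV:", "href")
doc.appendChild(href)
href.appendChild(doc.createElementNS(None, "no_ns"))

declare_namespaces(doc)
out = doc.toxml()
print(out)
```

Note that the sketch modifies the tree in place; a serializer-level fix-up would more likely compute the declarations while writing, leaving the DOM untouched.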

Believe me, I've spent many weary hours with all these issues, and
implemented code to deal with the mess multiple times, and I know it
all too painfully well. I wrote Amara largely because I got
irrecoverably sick of DOM's idiosyncrasies.

Andrew, for this reason I will probably take the initiative to work up a
patch for the issue. I'll do what I can to get to it tomorrow. If you
help me with code review and maybe writing some tests, that would be a
huge help.

--
Uche Ogbuji Fourthought, Inc.
http://uche.ogbuji.net http://fourthought.com
http://copia.ogbuji.net http://4Suite.org
Articles: http://uche.ogbuji.net/tech/publications/

Dec 2 '05 #8

P: n/a
Uche <uc*********@gmail.com> wrote:
Of course. Minidom implements level 2 (thus the "NS" at the end of the
method name), which means that its APIs should all be namespace aware.
The bug is that writexml() and thus toxml() are not so.


Not exactly a bug - DOM Level 2 Core 1.1.8p2 explicitly leaves
namespace fixup at the mercy of the application. It's only standardised
as a DOM feature in Level 3, which minidom does not yet claim to
support. It would be a nice feature to add, but it's not entirely
trivial to implement, especially when you can serialize a partial DOM
tree.
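
To illustrate the partial-tree case, here is a small sketch assuming Python 3 and the standard-library minidom. The WebDAV-style document is made up for the example; the point is only that a child serialized on its own does not carry a declaration that lives on its parent.

```python
from xml.dom import minidom
from xml.parsers.expat import ExpatError

doc = minidom.parseString(
    '<D:multistatus xmlns:D="DAV:"><D:href>/index.html</D:href></D:multistatus>'
)
href = doc.documentElement.firstChild

# The node still knows its namespace through its properties...
print(href.namespaceURI)

# ...but the declaration sits on the parent, so the snippet omits it.
snippet = href.toxml()
print(snippet)

# Re-parsing the snippet on its own is expected to fail (unbound prefix).
try:
    minidom.parseString(snippet)
    snippet_parses = True
except ExpatError:
    snippet_parses = False
```

Any fix-up for subtrees would therefore have to look at ancestors to find which declarations are in scope, which is part of why it is not trivial.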

Additionally, it might have some compatibility problems with apps that
don't expect namespace declarations to automagically appear. For
example, perhaps, an app dealing with HTML that doesn't want spare
xmlns="http://www.w3.org/1999/xhtml" declarations appearing in every
snippet of serialized output.

So it should probably be optional. In DOM Level 3 (and pxdom) there's a
DOMConfiguration parameter 'namespaces' to control it; perhaps for
minidom an argument to toxml() might be best?

--
And Clover
mailto:an*@doxdesk.com
http://www.doxdesk.com/

Dec 3 '05 #9

P: n/a
[uc*********@gmail.com]
Oh no. That only means that namespace declaration attributes are not
created in the DOM data structure. However, output has to fix up
namespaces in .namespaceURI properties as well as directly asserted
"xmlns" attributes. It would be silly for DOM to produce malformed
XML+XMLNS, and of course it is not meant to. The minidom behavior
needs fixing, badly.

[Alan Kennedy] My interpretation of namespace nodes is that the application is
responsible for creating whatever namespace declaration attribute nodes
are required, on the DOM tree.

DOM should not have to imply any attributes on output.
[uc*********@gmail.com] ..... you have to perform fix-up to merge properties with
explicit NSDEcl attributes in order to serialize. If it does not do
so, it is losing all the information in namespace properties, and the
resulting output is not the same document that is represented in the
DOM.


Well, my reading of the DOM L2 spec is such that it does not agree with
the statement above.

http://www.w3.org/TR/2000/REC-DOM-Le...1113/core.html

Section 1.1.8: XML Namespaces

"""
Namespace validation is not enforced; the DOM application is
responsible. In particular, since the mapping between prefixes and
namespace URIs is not enforced, in general, the resulting document
cannot be serialized naively. For example, applications may have to
declare every namespace in use when serializing a document.
"""

To me, this means, as AMK originally stated, that DOM L2 is capable
of creating documents that are not well-formed wrt namespaces, i.e.
documents that "cannot be serialized naively". It is the application author's
responsibility to ensure that their document is well formed.

Also, there is the following important note

"""
Note: In the DOM, all namespace declaration attributes are by definition
bound to the namespace URI: "http://www.w3.org/2000/xmlns/". These are
the attributes whose namespace prefix or qualified name is "xmlns".
"""

These namespace declaration nodes, i.e. attribute nodes in the
xml.dom.XMLNS_NAMESPACE namespace, are a pre-requisite for any
namespaced DOM document to be well-formed, and thus naively serializable.
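
As a small illustration of that note, the sketch below parses a document with minidom (Python 3 assumed) and looks up its declarations as ordinary attribute nodes in the xml.dom.XMLNS_NAMESPACE namespace. The urn:x-... URIs are placeholders invented for the example.

```python
import xml.dom
from xml.dom import minidom

root = minidom.parseString(
    '<a xmlns="urn:x-default" xmlns:p="urn:x-p"/>'
).documentElement

# Every declaration shows up as an attribute bound to the xmlns namespace.
for attr in root.attributes.values():
    print(attr.nodeName, attr.namespaceURI, attr.value)

# They can be looked up by (namespace URI, local name) like any other
# namespaced attribute.
default_decl = root.getAttributeNodeNS(xml.dom.XMLNS_NAMESPACE, "xmlns")
prefix_decl = root.getAttributeNodeNS(xml.dom.XMLNS_NAMESPACE, "p")
```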

The argument could be made that application authors should be protected
from themselves by having the underlying DOM library automatically
create the relevant namespace nodes.

But to me that's not pythonic: it's implicit, not explicit.

My vote is that the existing xml.dom.minidom behaviour wrt namespace
nodes is correct and should not be changed.

regards,

--
alan kennedy
------------------------------------------------------
email alan: http://xhaus.com/contact/alan
Dec 3 '05 #10

P: n/a
Alan Kennedy:
"""
These namespace declaration nodes, i.e. attribute nodes in the
xml.dom.XMLNS_NAMESPACE namespace, are a pre-requisite for any
namespaced DOM document to be well-formed, and thus naively
serializable.

The argument could be made that application authors should be protected
from themselves by having the underlying DOM library automatically
create the relevant namespace nodes.

But to me that's not pythonic: it's implicit, not explicit.

My vote is that the existing xml.dom.minidom behaviour wrt namespace
nodes is correct and should not be changed.
"""

Andrew Clover also suggested an overly-legalistic argument that current
minidom behavior is not a bug.

It's a very strange attitude that because a behavior is not
specifically proscribed in a spec, it is not a bug. Let me try a
reductio ad absurdum, which I think in this case is a very fair
stratagem. If the code in question produced:

>>> document = xml.dom.minidom.Document()
>>> element = document.createElementNS("DAV:", "href")
>>> document.appendChild(element)
<DOM Element: href at 0x1443e68>
>>> document.toxml()
'<?xml version="1.0" ?>\n<ferh/>'

(i.e. "ferh" rather than "href"), would you not consider that a minidom
bug?

Now consider that DOM Level 2 does not proscribe such mangling.

Do you still think that's a useful way to determine what is a bug?

The current, erroneous behavior, which you advocate, is the same kind of
bug. Minidom is an XML Namespaces aware API. In XML Namespaces, the
namespace URI is *part of* the name. No question about it. In Clark
notation the element name that is specified in

element = document.createElementNS("DAV:", "href")

is "{DAV:}href". In Clark notation the element name of the document
element in the created document is "href". That is not the name the
user specified. It is a mangled version of it. The mangling is no
better than my reductio of reversing the qname. This is a bug. Simple
as that. With this behavior, minidom is not an API correct with respect to
XML Namespaces.
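
The comparison can be sketched in code, assuming Python 3 and that minidom's serializer behaves as reported in this thread. The helper clark_name is hypothetical and exists only for this illustration; it compares the element name in the DOM with the name a namespace-aware parser sees after a round trip through toxml().

```python
from xml.dom import minidom

def clark_name(element):
    """Hypothetical helper: '{uri}local' if namespaced, else 'local'."""
    if element.namespaceURI:
        return "{%s}%s" % (element.namespaceURI, element.localName)
    return element.localName

doc = minidom.Document()
doc.appendChild(doc.createElementNS("DAV:", "href"))

# Name as held in the DOM tree.
before = clark_name(doc.documentElement)

# Name as recovered after serializing and parsing again.
after = clark_name(minidom.parseString(doc.toxml()).documentElement)

print(before, after)
```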

So you try the tack of invoking "pythonicness". Well I have one for
ya:

"In the face of ambiguity, refuse the temptation to guess."

You're guessing that explicit XMLNS attributes are the only way the
user means to express namespace information, even though DOM allows
this to be provided through such attributes *or* through namespace
properties. I could easily argue that since these are core properties
in the DOM, that DOM should ignore explicit XMLNS attributes and only
use namespace properties in determining output namespace. You are
guessing that XMLNS attributes (and only those) represent what the user
really means. I would be arguing the same of namespace properties.

The reality is that once the poor user has done:

element = document.createElementNS("DAV:", "href")

Following the DOM specification, they have created an element
in a namespace, and you seem to be arguing that they cannot usefully
have completed their work until they also do:

element.setAttributeNS(xml.dom.XMLNS_NAMESPACE, None, "DAV:")

I'd love to hear how many actual minidom users would agree with you.

It's currently a bug. It needs to be fixed. However, I have no time
for this bewildering fight. If the consensus is to leave minidom the
way it is, I'll just wash my hands of the matter, but I'll be sure to
emphasize heavily to users that minidom is broken with respect to
Namespaces and serialization, and that they abandon it in favor of
third-party tools.

--
Uche Ogbuji Fourthought, Inc.
http://uche.ogbuji.net http://fourthought.com
http://copia.ogbuji.net http://4Suite.org
Articles: http://uche.ogbuji.net/tech/publications/

Dec 5 '05 #11

P: n/a
I wrote:
"""
The reality is that once the poor user has done:

element = document.createElementNS("DAV:", "href")

They are following DOM specification that they have created an element
in a namespace, and you seem to be arguing that they cannot usefully
have completed their work until they also do:

element.setAttributeNS(xml.dom.XMLNS_NAMESPACE, None, "DAV:")

I'd love to hear how many actual minidom users would agree with you.
"""

Of course (FWIW) I meant

element.setAttributeNS(xml.dom.XMLNS_NAMESPACE, "xmlns", "DAV:")
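
For reference, a minimal sketch of that explicit declaration applied to the original example, assuming Python 3 and the standard-library minidom. Because the element has no prefix, the declaration attribute is the plain "xmlns" attribute.

```python
import xml.dom
from xml.dom import minidom

document = minidom.Document()
element = document.createElementNS("DAV:", "href")
document.appendChild(element)

# Declare the default namespace by hand so that it is serialized.
element.setAttributeNS(xml.dom.XMLNS_NAMESPACE, "xmlns", "DAV:")

serialized = document.toxml()
print(serialized)
```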

--
Uche Ogbuji Fourthought, Inc.
http://uche.ogbuji.net http://fourthought.com
http://copia.ogbuji.net http://4Suite.org
Articles: http://uche.ogbuji.net/tech/publications/

Dec 5 '05 #12

P: n/a
[uc*********@gmail.com]
The current, erroneous behavior, which you advocate, is of the same
bug. Minidom is an XML Namespaces aware API. In XML Namespaces, the
namespace URI is *part of* the name. No question about it. In Clark
notation the element name that is specified in

element = document.createElementNS("DAV:", "href")

is "{DAV:}href". In Clark notation the element name of the document
element in the created docuent is "href".
I think if we're going to get anywhere in this discussion, we'll have to
stick to the convention that we are dealing with some specific values. I
suggest the following

element_local_name = 'href'
element_ns_prefix = 'DAV'
element_ns_uri = 'somescheme://someuri'

Therefore, in Clark notation, the qualified name of the element in the
OPs example is "{somescheme://someuri}href". (Yes, I know that "DAV:" is
a valid namespace URI. But it's a poor example because it looks like a
namespace prefix, and may be giving rise to some confusion.)

So, to create a namespaced element, we must specify the namespace uri,
the namespace prefix and the element local name, like so

qname = "%s:%s" % (element_ns_prefix, element_local_name)
element = document.createElementNS(element_ns_uri, qname)

Now, if we create, as the OP did, an element with a namespace uri but no
prefix, like so

element = document.createElementNS(element_ns_uri, element_local_name)

that element *cannot* be serialised naively, because the namespace
prefix has not been declared. Yes, the element is correctly scoped to
the element_ns_uri namespace, but it cannot be serialised because
declaration of namespace prefixes is a pre-requisite of the Namespaces
REC. Relevant quotes from the Namespaces REC are

"""
URI references can contain characters not allowed in names, so cannot be
used directly as namespace prefixes. Therefore, the namespace prefix
serves as a proxy for a URI reference. An attribute-based syntax
described below is used to declare the association of the namespace
prefix with a URI reference; software which supports this namespace
proposal must recognize and act on these declarations and prefixes.
"""

and

"""
Namespace Constraint: Prefix Declared
The namespace prefix, unless it is xml or xmlns, must have been declared
in a namespace declaration attribute in either the start-tag of the
element where the prefix is used or in an an ancestor element (i.e. an
element in whose content the prefixed markup occurs).
"""

http://www.w3.org/TR/REC-xml-names/

[uc*********@gmail.com] So you try the tack of invoking "pythonicness". Well I have one for
ya:

"In the face of ambiguity, refuse the temptation to guess."
Precisely: If the user has created a document that is not namespace
correct, then do not try to guess whether it should be corrected or not:
simply serialize the dud document. If the user wants a namespace
well-formed document, then they are responsible for either ensuring that
the relevant namespaces, prefixes and uris are explicitly declared, or
for explicitly calling some normalization routine that automagically
does that for them.

[uc*********@gmail.com] You re guessing that explicit XMLNS attributes are the only way the
user means to express namespace information, even though DOM allows
this to be provided through such attributes *or* through namespace
properties. I could easily argue that since these are core properties
in the DOM, that DOM should ignore explicit XMLNS attributes and only
use namespace properties in determining output namespace. You are
guessing that XMLNS attributes (and only those) represent what the
user really means. I would be arguing the same of namespace
properties.
I'm not guessing anything: I'm asserting that with DOM Level 2, the user
is expected to manage their own namespace prefix declarations.

DOM L2 states that "Namespace validation is not enforced; the DOM
application is responsible. In particular, since the mapping between
prefixes and namespace URIs is not enforced, in general, the resulting
document cannot be serialized naively."

DOM L3 provides the normalizeNamespaces method, which the user should
have to *explicitly* call in order to make their document namespace
well-formed if it was not already.

http://www.w3.org/TR/2004/REC-DOM-Le...lgorithms.html

The proposal that minidom should automagically fixup namespace
declarations and prefixes on output would leave it compliant with
*neither* DOM L2 nor L3.

[uc*********@gmail.com] The reality is that once the poor user has done:

element = document.createElementNS("DAV:", "href")

They are following DOM specification that they have created an element
in a namespace, and you seem to be arguing that they cannot usefully
have completed their work until they also do:

element.setAttributeNS(xml.dom.XMLNS_NAMESPACE, None, "DAV:")
Actually no, that statement produces "AttributeError: 'NoneType' object
has no attribute 'split'". I believe that you're confusing "DAV:" as a
namespace uri with "DAV" as a namespace prefix.

Code for creating the correct prefix declaration is

prefix_decl = "xmlns:%s" % element_ns_prefix
element.setAttributeNS(xml.dom.XMLNS_NAMESPACE, prefix_decl, element_ns_uri)
I'd love to hear how many actual minidom users would agree with you.

It's currently a bug. It needs to be fixed. However, I have no time
for this bewildering fight. If the consensus is to leave minidom the
way it is, I'll just wash my hands of the matter, but I'll be sure to
emphasize heavily to users that minidom is broken with respect to
Namespaces and serialization, and that they abandon it in favor of
third-party tools.


It's not a bug, it doesn't need fixing, minidom is not broken.

Although I am sympathetic to your bewilderment: xml namespaces can be
overly complex when it comes to the nitty, gritty details.

--
alan kennedy
------------------------------------------------------
email alan: http://xhaus.com/contact/alan
Dec 5 '05 #13

P: n/a
> Is this automatic creation an expected behaviour?

> Of course.

> Not exactly a bug /.../ So it should probably be optional.

> My interpretation of namespace nodes is that the application is
> responsible /.../

> I'm sorry but you're wrong on this.

> Well, my reading of the DOM L2 spec is such that it does not agree with
> the statement above.

> It's currently a bug. It needs to be fixed.

> It's not a bug, it doesn't need fixing, minidom is not broken.


further p^H^H^H^H^H^H^H^H^H

can anyone perhaps dig up a DOM L2 implementation that's not written
by anyone involved in this thread, and see what it does ?

</F>

Dec 5 '05 #14

P: n/a
Fredrik Lundh wrote:
can anyone perhaps dig up a DOM L2 implementation that's not written
by anyone involved in this thread, and see what it does ?


Alright. Look away from the wrapper code (which I wrote, and which
doesn't do anything particularly clever) and look at the underlying
libxml2 serialisation behaviour:

import libxml2dom
document = libxml2dom.createDocument("DAV:", "href", None)
print document.toString()

This outputs the following:

<?xml version="1.0"?>
<href xmlns="DAV:"/>

To reproduce the creation of bare Document objects (which I thought
wasn't strictly supported by minidom), we perform some tricks:

document = libxml2dom.createDocument(None, "doc", None)
top = document.xpath("*")[0]
element = document.createElementNS("DAV:", "href")
document.replaceChild(element, top)
print document.toString()

This outputs the following:

<?xml version="1.0"?>
<href xmlns="DAV:"/>

While I can understand the desire to suppress xmlns attribute
generation for certain document types, this is probably only
interesting for legacy XML processors and for HTML. Leaving such
attributes out by default, whilst claiming some kind of "fine print"
standards compliance, is really a recipe for unnecessary user
frustration.

Paul

Dec 5 '05 #15

P: n/a
[Fredrik Lundh]
can anyone perhaps dig up a DOM L2 implementation that's not written
by anyone involved in this thread, and see what it does ?

[Paul Boddie] document = libxml2dom.createDocument(None, "doc", None)
top = document.xpath("*")[0]
element = document.createElementNS("DAV:", "href")
document.replaceChild(element, top)
print document.toString()

This outputs the following:

<?xml version="1.0"?>
<href xmlns="DAV:"/>
But that's incorrect. You have now defaulted the namespace to "DAV:" for
every unprefixed element that is a descendant of the href element.

Here is an example

#-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
document = libxml2dom.createDocument(None, "doc", None)
top = document.xpath("*")[0]
elem1 = document.createElementNS("DAV:", "href")
document.replaceChild(elem1, top)
elem2 = document.createElementNS(None, "no_ns")
document.childNodes[0].appendChild(elem2)
print document.toString()
#-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

which produces

"""
<?xml version="1.0"?>
<href xmlns="DAV:"><no_ns/></href>
"""

The defaulting rules of XML namespaces state

"""
5.2 Namespace Defaulting

A default namespace is considered to apply to the element where it is
declared (if that element has no namespace prefix), and to all elements
with no prefix within the content of that element.
"""

http://www.w3.org/TR/REC-xml-names/#defaulting

So although I have explicitly specified no namespace for the no_ns
subelement, it now defaults to the default "DAV:" namespace which has
been declared in the automagically created xmlns attribute. This is
wrong behaviour.
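
The defaulting rule can be checked with a namespace-aware parser. The sketch below assumes Python 3 and minidom's parser; child_namespace is a hypothetical helper. It compares three serializations: the one produced above, a variant that undeclares the default namespace on the child with xmlns="" (standard Namespaces in XML syntax, though not something proposed in this thread), and a variant that puts a prefix on the parent instead of a default declaration.

```python
from xml.dom import minidom

def child_namespace(xml_text):
    """Hypothetical helper: namespace URI of the root's first child."""
    root = minidom.parseString(xml_text).documentElement
    return root.firstChild.namespaceURI

# Default namespace on the parent: the unprefixed child inherits it.
inherited = child_namespace('<href xmlns="DAV:"><no_ns/></href>')

# Child explicitly undeclares the default namespace.
undeclared = child_namespace('<href xmlns="DAV:"><no_ns xmlns=""/></href>')

# Parent uses a prefix, so no default namespace is in scope.
prefixed = child_namespace('<myns:href xmlns:myns="DAV:"><no_ns/></myns:href>')

print(inherited, undeclared, prefixed)
```

Only the first form should report "DAV:" for the child; the other two should report no namespace.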

If I want for my sub-element to truly have no namespace, I have to write
it like this

"""
<?xml version="1.0"?>
<myns:href xmlns:myns="DAV:"><no_ns/></myns:href>
"""

[Paul Boddie] Leaving such
attributes out by default, whilst claiming some kind of "fine print"
standards compliance, is really a recipe for unnecessary user
frustration.


On the contrary, once you start second guessing the standards and making
guesses about what users are really trying to do, and making decisions
for them, then some people are going to get different behaviour from
what they rightfully expect according to the standard. People whose
expectations match with the guesses made on their behalf will find that
their software is not portable between DOM implementations.

With something as finicky as XML namespaces, you can't just make ad-hoc
decisions as to what the user "really wants". That's why DOM L2 punted
on the whole problem, and left it to DOM L3.

--
alan kennedy
------------------------------------------------------
email alan: http://xhaus.com/contact/alan
Dec 5 '05 #16

P: n/a
> Leaving such attributes out by default, whilst claiming some kind of
> "fine print" standards compliance, is really a recipe for unnecessary user
> frustration.

> On the contrary, once you start second guessing the standards and making
> guesses about what users are really trying to do, and making decisions
> for them, then some people are going to get different behaviour from
> what they rightfully expect according to the standard. People whose
> expectations match with the guesses made on their behalf will find that
> their software is not portable between DOM implementations.

and this hypothetical situation is different from the current situation in
exactly what way?

> With something as finicky as XML namespaces, you can't just make ad-hoc
> decisions as to what the user "really wants". That's why DOM L2 punted
> on the whole problem, and left it to DOM L3.


so L2 is the "we support namespaces, but we don't really support them"
level ?

maybe we could take everyone involved with the DOM design out to the
backyard and beat them with empty PET bottles until they promise never
to touch a computer again ?

</F>

Dec 5 '05 #17

P: n/a
[Alan Kennedy]
On the contrary, once you start second guessing the standards and making
guesses about what users are really trying to do, and making decisions
for them, then some people are going to get different behaviour from
what they rightfully expect according to the standard. People whose
expectations match with the guesses made on their behalf will find that
their software is not portable between DOM implementations.
[Fredrik Lundh]
and this hypothetical situation is different from the current situation in
exactly what way?
Hmm, not sure I understand what you're getting at.

If changes are made to minidom that implement non-standard behaviour,
there are two groups of people I'm thinking of

1. The people who expect the standard behaviour, not the modified
behaviour. From these people's POV, the software can then be considered
broken, since it produces different results from what is expected
according to the standard.

2. The people who are ignorant of the decisions made on their behalf,
and assume that they have written correct code. But their code won't
work on other DOM implementations (because the automagic namespace fixup
code isn't present, for example). From these people's POV, the software
can then be considered broken.

[Alan Kennedy]With something as finicky as XML namespaces, you can't just make ad-hoc
decisions as to what the user "really wants". That's why DOM L2 punted
on the whole problem, and left it to DOM L3.


[Fredrik Lundh] so L2 is the "we support namespaces, but we don't really support them"
level ?
Well, I read it as "we support namespaces, but only if you know what
you're doing".

[Fredrik Lundh] maybe we could take everyone involved with the DOM design out to the
backyard and beat them with empty PET bottles until they promise never
to touch a computer again ?


:-D

--
alan kennedy
------------------------------------------------------
email alan: http://xhaus.com/contact/alan
Dec 6 '05 #18

P: n/a
Alan Kennedy wrote:
[Fredrik Lundh]
and this hypothetical situation is different from the current situation in
exactly what way?


Hmm, not sure I understand what you're getting at.

If changes are made to minidom that implement non-standard behaviour,
there are two groups of people I'm thinking of

1. The people who expect the standard behaviour, not the modified
behaviour. From these people's POV, the software can then be considered
broken, since it produces different results from what is expected
according to the standard.

2. The people who are ignorant of the decisions made on their behalf,
and assume that they have written correct code. But their code won't
work on other DOM implementations (because the automagic namespace fixup
code isn't present, for example). From these people's POV, the software
can then be considered broken.


my point was that (unless I'm missing something here), there are at least
two widely used implementations (libxml2 and the 4DOM domlette stuff) that
don't interpret the spec in this way.
so L2 is the "we support namespaces, but we don't really support them"
level ?


Well, I read it as "we support namespaces, but only if you know what
you're doing".


or "we support namespaces, but no matter how you interpret the word
'support', we probably mean something else. nyah nyah!"

</F>

Dec 6 '05 #19

Alan Kennedy
"""
Although I am sympathetic to your bewilderment: xml namespaces can be
overly complex when it comes to the nitty, gritty details.
"""

You're the one who doesn't seem to clearly understand XML namespaces.
It's your position that is bewildering, not XML namespaces (well, they
are confusing, but I have a good handle on all the nuances by now).

Again, no skin off my back here: I write and use tools that are XML
namespaces compliant. It doesn't hurt me that Minidom is not. I was
hoping to help, but again I don't have time for this argument.

--
Uche Ogbuji Fourthought, Inc.
http://uche.ogbuji.net http://fourthought.com
http://copia.ogbuji.net http://4Suite.org
Articles: http://uche.ogbuji.net/tech/publications/

Dec 6 '05 #20

Wilfredo Sánchez Vega:
"""
I'm having some issues around namespace handling with XML:
document = xml.dom.minidom.Document()
element = document.createElementNS("DAV:", "href")
document.appendChild(element)
<DOM Element: href at 0x1443e68>
document.toxml()
'<?xml version="1.0" ?>\n<href/>'

Note that the namespace wasn't emitted. If I have PyXML,
xml.dom.ext.Print does emit the namespace:
xml.dom.ext.Print(document)

<?xml version='1.0' encoding='UTF-8'?><href xmlns='DAV:'/>

Is that a limitation in toxml(), or is there an option to make it
include namespaces?
"""

Getting back to the OP:

PyXML's xml.dom.ext.Print does get things right, and based on
discussion in this thread, the only way you can serialize correctly is
to use that add-on with minidom, or to use a third party, properly
Namespaces-aware tool such as 4Suite (there are others as well).

Good luck.

--
Uche Ogbuji Fourthought, Inc.
http://uche.ogbuji.net http://fourthought.com
http://copia.ogbuji.net http://4Suite.org
Articles: http://uche.ogbuji.net/tech/publications/

Dec 6 '05 #21

[Fredrik Lundh]
my point was that (unless I'm missing something here), there are at
least two widely used implementations (libxml2 and the 4DOM domlette
stuff) that don't interpret the spec in this way.


Libxml2dom is of alpha quality, according to its CheeseShop page anyway.

http://cheeseshop.python.org/pypi/libxml2dom/0.2.4

This can be seen in its incorrect serialisation of the following valid DOM.

#-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
import xml.dom
import libxml2dom

document = libxml2dom.createDocument(None, "doc", None)
top = document.xpath("*")[0]
elem1 = document.createElementNS("DAV:", "myns:href")
elem1.setAttributeNS(xml.dom.XMLNS_NAMESPACE, "xmlns:myns", "DAV:")
document.replaceChild(elem1, top)
print document.toString()
#-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

Which produces

"""
<?xml version="1.0"?>
<myns:href
xmlns:myns="DAV:"
xmlns:xmlns="http://www.w3.org/2000/xmlns/"
xmlns:myns="DAV:"
/>
"""

Which is not even well-formed XML (duplicate attributes), let alone
namespace well-formed. Note also the invalid xml namespace "xmlns:xmlns"
attribute. So I don't accept that libxml2dom's behaviour is definitive
in this case.
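
The well-formedness claim can be checked by feeding such output to any conforming parser. The sketch below is an illustration only: it uses the standard-library minidom parser (built on expat) in Python 3 syntax rather than the tools used above, and it reduces the output to the duplicated attribute. The helper name is_well_formed is made up for this example.

```python
import xml.dom.minidom
from xml.parsers.expat import ExpatError

# Reduced form of the output above: the same xmlns:myns attribute appears twice.
DUPLICATED = '<myns:href xmlns:myns="DAV:" xmlns:myns="DAV:"/>'


def is_well_formed(text):
    """Return True if the parser accepts the text, False if it rejects it."""
    try:
        xml.dom.minidom.parseString(text)
    except ExpatError:
        return False
    return True


print(is_well_formed(DUPLICATED))
print(is_well_formed('<myns:href xmlns:myns="DAV:"/>'))
```

The first call should report a rejection because an element may not carry the same attribute name twice; the second, with a single declaration, parses normally.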

The other DOM you refer to, the 4DOM stuff, was written by a participant
in this discussion.

Will you accept Apache Xerces 2 for Java as a widely used DOM
Implementation? I guarantee that it is far more widely used than either
of the DOMs mentioned.

Download Xerces 2 (I am using Xerces 2.7.1), and run the following code
under jython:-

http://www.apache.org/dist/xml/xerces-j/

#-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
#
# This is a simple adaptation of the DOMGenerate.java
# sample from the Xerces 2.7.1 distribution.
#
from javax.xml.parsers import DocumentBuilder, DocumentBuilderFactory
from org.apache.xml.serialize import OutputFormat, XMLSerializer
from java.io import StringWriter

def create_document():
    dbf = DocumentBuilderFactory.newInstance()
    db = dbf.newDocumentBuilder()
    return db.newDocument()

def serialise(doc):
    format = OutputFormat( doc )
    outbuf = StringWriter()
    serial = XMLSerializer( outbuf, format )
    serial.asDOMSerializer()
    serial.serialize(doc.getDocumentElement())
    return outbuf.toString()

doc = create_document()
root = doc.createElementNS("DAV:", "href")
doc.appendChild( root )
print serialise(doc)
#-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

Which produces

"""
<?xml version="1.0" encoding="UTF-8"?>
<href/>
"""

As I expected it would.

--
alan kennedy
------------------------------------------------------
email alan: http://xhaus.com/contact/alan
Dec 6 '05 #22

[uc*********@gmail.com]
You're the one who doesn't seem to clearly understand XML namespaces.
It's your position that is bewildering, not XML namespaces (well, they
are confusing, but I have a good handle on all the nuances by now).

So you keep claiming, but I have yet to see the evidence.

[uc*********@gmail.com]
Again, no skin off my back here: I write and use tools that are XML
namespaces compliant. It doesn't hurt me that Minidom is not. I was
hoping to help, but again I don't have time for this argument.


If you make statements such as "you're wrong on this ....", "you
misunderstand ....", "you're guessing .....", etc, then you should be
prepared to back them up, not state them and then say "but I'm too busy
and/or important to discuss it with you".

Perhaps you should think twice before making such statements in the future.

--
alan kennedy
------------------------------------------------------
email alan: http://xhaus.com/contact/alan
Dec 6 '05 #23

Alan Kennedy wrote:
[Fredrik Lundh]
my point was that (unless I'm missing something here), there are at
least two widely used implementations (libxml2 and the 4DOM domlette
stuff) that don't interpret the spec in this way.
Libxml2dom is of alpha quality, according to its CheeseShop page anyway.

http://cheeseshop.python.org/pypi/libxml2dom/0.2.4


but isn't libxml2dom just a binding for libxml2? as I mention above, I had libxml2
in mind when I wrote "widely used", not the libxml2dom binding itself.
Will you accept Apache Xerces 2 for Java as a widely used DOM
Implementation?


sure.

but libxml2 is also widely used, so we have at least two ways to interpret the spec.
the defacto interpretation of the spec seems to be that namespace handling during
serialization is "undefined"...

(is there perhaps a DOM library that starts "hack" or "rogue" when you use name-
spaces ? ;-)

</F>

Dec 6 '05 #24

[Fredrik Lundh]
but isn't libxml2dom just a binding for libxml2? as I mention above, I had libxml2
in mind when I wrote "widely used", not the libxml2dom binding itself.
No, libxml2dom is Paul Boddie's DOM API compatibility layer on top of
the cpython bindings for libxml2. From the CheeseShop page

"""
The libxml2dom package provides a traditional DOM wrapper around the
Python bindings for libxml2. In contrast to the libxml2 bindings,
libxml2dom provides an API reminiscent of minidom, pxdom and other
Python-based and Python-related XML toolkits.
"""

http://cheeseshop.python.org/pypi/libxml2dom

[Alan Kennedy]
Will you accept Apache Xerces 2 for Java as a widely used DOM
Implementation?


[Fredrik Lundh] sure.

but libxml2 is also widely used, so we have at least two ways to interpret the spec.


Don't confuse libxml2dom with libxml2.

As I showed with a code snippet in a previous message, libxml2dom has
significant defects in relation to serialisation of namespaced
documents, whereby the serialised documents it produces aren't even
well-formed xml.

Perhaps you can show a code snippet in libxml2 that illustrates the
behaviour you describe?

--
alan kennedy
------------------------------------------------------
email alan: http://xhaus.com/contact/alan
Dec 6 '05 #25

Alan Kennedy wrote:
Libxml2dom is of alpha quality, according to its CheeseShop page anyway.


Given that I gave it that classification, let me explain that its alpha
status is primarily justified by the fact that it doesn't attempt to
cover the entire DOM API. As I mentioned in my original contribution to
this thread, the serialisation is done by libxml2 itself - arguably a
wise choice given the abysmal performance of many Python DOM
implementations when serialising documents.

I'll look into namespace-setting issues in the libxml2 API, but I
imagine that the serialisation mechanisms control much of what you're
seeing, and it's quite possible that they can be configured to perform
in whichever way is desirable.

Paul

Dec 6 '05 #26

Alan Kennedy wrote:
Don't confuse libxml2dom with libxml2.


Well, quite, but perhaps you can explain what I'm doing wrong with this
low-level version of the previously specified code:

import libxml2mod
document = libxml2mod.xmlNewDoc(None)
element = libxml2mod.xmlNewChild(document, None, "href", None)
print libxml2mod.serializeNode(document, None, 1)

This prints the following:

<?xml version="1.0"?>
<href/>

Extending the above code...

ns = libxml2mod.xmlNewNs(element, "DAV:", None)
print libxml2mod.serializeNode(document, None, 1)

This prints the following:

<?xml version="1.0"?>
<href xmlns="DAV:"/>

Note that libxml2mod is as close to the libxml2 C API as you can get in
Python. As far as I can tell, by using that module, you're effectively
driving the C API almost directly. Note also that libxml2mod is nothing
to do with what I've written myself - I'm just using it here, just as
libxml2dom does.

Now, in the first part of the code, we didn't specify a namespace on
the element at all, but in the second part we chose to set a namespace
on the element with a null prefix. As you can see, we get the xmlns
attribute as soon as the namespace is introduced. It is difficult to
say whether this usage of the API is correct or not, judging from the
Web site's material [1], so I'd be happy if someone could point out
improvements or corrections.

Paul

[1] http://xmlsoft.org/

Dec 6 '05 #27

Alan Kennedy wrote:
[Fredrik Lundh]
but isn't libxml2dom just a binding for libxml2? as I mention above, I had libxml2
in mind when I wrote "widely used", not the libxml2dom binding itself.


No, libxml2dom is Paul Boddie's DOM API compatibility layer on top of
the cpython bindings for libxml2.


So a binding that just passes things through to another binding is not
a binding? Alright, let's call it a compatibility layer then.
but libxml2 is also widely used, so we have at least two ways to
interpret the spec.


Don't confuse libxml2dom with libxml2.


As Paul has said several times, libxml2dom is just a thin API compatibility
layer on top of libxml2. It's libxml2 that does all the work, and the libxml2
authors claim that libxml2 implements the DOM level 2 document model,
but with a different API.

Maybe they're wrong, but wasn't the whole point of this subthread that
different developers have interpreted the specification in different ways ?

</F>

Dec 7 '05 #28

[Fredrik Lundh]
It's libxml2 that does all the work, and the libxml2
authors claim that libxml2 implements the DOM level 2 document model,
but with a different API.

That statement is meaningless.

The DOM is *only* an API, i.e. an interface. The opening statement on
the W3C DOM page is

"""
What is the Document Object Model?

The Document Object Model is a platform- and language-neutral interface
that will allow programs and scripts to dynamically access and update
the content, structure and style of documents.
"""

http://www.w3.org/DOM/

The interfaces that make up the different levels of the DOM are
described in CORBA IDL - Interface Definition Language.

DOM Implementations are free to implement the methods and properties of
the IDL interfaces as they see fit. Some implementations might maintain
an object model, with separate objects for each node in the tree,
several string variables associated with each node, i.e. node name,
namespace, etc. But they could just as easily store those data in
tables, indexed by some node id. (As an aside, the non-DOM-compatible
Xalan Table Model does exactly that:
http://xml.apache.org/xalan-j/dtm.html).

So when the libxml2 developers say (copied from http://www.xmlsoft.org/)

"""
To some extent libxml2 provides support for the following additional
specifications but doesn't claim to implement them completely:

* Document Object Model (DOM)
http://www.w3.org/TR/DOM-Level-2-Core/ the document model, but it
doesn't implement the API itself, gdome2 does this on top of libxml2
"""

They've completely missed the point: DOM is *only* the API.

[Fredrik Lundh]
Maybe they're wrong, but wasn't the whole point of this subthread that
different developers have interpreted the specification in different ways ?


What specification? Libxml2 implements none of the DOM specifications.

--
alan kennedy
------------------------------------------------------
email alan: http://xhaus.com/contact/alan
Dec 7 '05 #29

[Alan Kennedy]
Don't confuse libxml2dom with libxml2.

[Paul Boddie] Well, quite, but perhaps you can explain what I'm doing wrong with this
low-level version of the previously specified code:


Well, if your purpose is to make a point about minidom and DOM standards
compliance in relation to serialisation of namespaces, then what you're
doing wrong is to use a library that bears no relationship to the DOM to
make your point.

Think about it this way: Say you decide to create a new XML document
using a non-DOM library, such as the excellent ElementTree.

So you make a series of ElementTree-API-specific calls to create the
document, the elements, attributes, namespaces, etc, and then serialise
the whole thing.

And the end result is that you end up with a document that looks like this

"""
<?xml version="1.0" encoding="utf-8"?>
<href xmlns="DAV:"/>
"""

It is not possible to use that ElementTree code to make inferences on
how minidom should behave, because the syntax and semantics of the
minidom API calls and the ElementTree API calls are different.
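
The ElementTree scenario described above can be sketched with the version of ElementTree that now ships in the Python 3 standard library (an assumption of this example; the thread predates it). Note that by default the serialiser invents a prefix of its own choosing rather than the default-namespace form shown above, so the exact text differs while the namespace is still declared.

```python
import xml.etree.ElementTree as ET

# ElementTree spells a namespaced name as "{uri}local" (Clark notation).
root = ET.Element("{DAV:}href")
serialised = ET.tostring(root, encoding="unicode")
print(serialised)

# The serialiser wrote a declaration on its own, so the namespace survives
# a round trip through the parser.
reparsed = ET.fromstring(serialised)
print(reparsed.tag)
```

This is ElementTree's own design decision, which is the point being made: its API has different semantics from the DOM calls, so its output says nothing about what minidom ought to emit.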

Minidom is constrained to implement the precise semantics of the DOM
APIs, because it claims standards compliance.

ElementTree is free to do whatever it likes, e.g. be pythonic, because
it has no standard to conform to: it is designed solely according to the
experience and intuition of its author, who is free change it at any
stage if he feels like it.

s/ElementTree/libxml2/g

If I've completely missed your point and you were talking something else
entirely, please forgive me. I'd be happy to help with any questions if
I can.

--
alan kennedy
------------------------------------------------------
email alan: http://xhaus.com/contact/alan
Dec 7 '05 #30

Paul Boddie wrote:
It is difficult to say whether this usage of the API is correct or not, judging from the
Web site's material


[...]

Some more on this: I found an example on the libxml2 mailing list
(searching for "xmlNewNs default namespace") which is similar to the
one I gave:

http://mail.gnome.org/archives/xml/2.../msg00282.html

Meanwhile, the usage of xmlNewNs seems to have some correlation with
the production of xmlns attributes (found in a search for "xmlns
default namespace"):

http://mail.gnome.org/archives/xml/2.../msg00111.html

And whilst gdome2 - the GNOME project's DOM wrapper for libxml2 - seems
to create unowned namespaces, adding them to the document as global
namespace declarations (looking at the code for gdome_xmlNewNs and
gdome_xml_doc_createElementNS respectively)...

http://cvs.gnome.org/viewcvs/gdome2/...18&view=markup
http://cvs.gnome.org/viewcvs/gdome2/...50&view=markup

...seemingly comparable operations with libxml2mod seem to be no longer
supported:
libxml2mod.xmlNewGlobalNs(d, "DAV:", None)

xmlNewGlobalNs() deprecated function reached

Given that I've recently unsubscribed from some pretty unproductive
mailing lists, perhaps I should make some enquiries on the libxml2
mailing list and possibly report back.

Paul

Dec 7 '05 #31

Alan Kennedy wrote:
Well, if your purpose is to make a point about minidom and DOM standards
compliance in relation to serialisation of namespaces, then what you're
doing wrong is to use a library that bears no relationship to the DOM to
make your point.


Alright. I respectfully withdraw libxml2/libxml2dom as an example of a
DOM Level 2 compatible implementation. Since I only profess to support
"a PyXML-style DOM" in libxml2dom, the course I take in any amendments
to that package will follow whatever Uche decides to do with 4DOM and
PyXML. ;-) Whatever happens, I'll attempt to make it compatible with
qtxmldom in both its flavours (qtxml and KHTML).

As for the various issues with namespaces and the DOM, with memories of
slapping empty xmlns attributes strategically-but-desperately in XSL
processing pipelines to avoid invisible-but-still-present default
namespaces now thankfully receding into the incoherent past, the whole
business merely reinforces my impression of the various standards
committees as a group of corporate delegates meeting regularly to hold
a "measuring competition" amongst themselves.

Paul

Dec 7 '05 #32

Alan Kennedy wrote:

[Discussing the appearance of xmlns="DAV:"]
But that's incorrect. You have now defaulted the namespace to "DAV:" for
every unprefixed element that is a descendant of the href element.
[Code creating the no_ns element with namespaceURI set to None]
<?xml version="1.0"?>
<href xmlns="DAV:"><no_ns/></href>


I must admit that I was focusing on the first issue rather than this
one, even though it is related, when I responded before. Moreover,
libxml2dom really should respect the lack of a namespace on the no_ns
element, which the current version unfortunately doesn't do. However,
wouldn't the correct serialisation of the document be as follows?

<?xml version="1.0"?>
<href xmlns="DAV:"><no_ns xmlns=""/></href>

As for the first issue - the presence of the xmlns attribute in the
serialised document - I'd be interested to hear whether it is
considered acceptable to parse the serialised document and to find that
no non-null namespaceURI is set on the href element, given that such a
namespaceURI was set when the document was created. In other words, ...

document = libxml2dom.createDocument(None, "doc", None)
top = document.xpath("*")[0]
elem1 = document.createElementNS("DAV:", "href")
document.replaceChild(elem1, top)
elem2 = document.createElementNS(None, "no_ns")
document.xpath("*")[0].appendChild(elem2)
document.toFile(open("test_ns.xml", "wb"))

...as before, followed by this test:

document = libxml2dom.parse("test_ns.xml")
print "Namespace is", repr(document.xpath("*")[0].namespaceURI)

What should the "Namespace is" message produce?

Paul

Dec 11 '05 #33

[Paul Boddie]
However,
wouldn't the correct serialisation of the document be as follows?

<?xml version="1.0"?>
<href xmlns="DAV:"><no_ns xmlns=""/></href>
Yes, the correct way to override a default namespace is an xmlns=""
attribute.
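
The effect of the xmlns="" override can be seen from the parsing side. This is a small sketch using the standard-library minidom parser in Python 3 (chosen here for convenience; it is not one of the tools under discussion), comparing the two serialisations quoted above:

```python
import xml.dom.minidom

# Without the override, the child inherits the default namespace.
inherited = xml.dom.minidom.parseString(
    '<href xmlns="DAV:"><no_ns/></href>')

# With xmlns="", the default namespace is switched off for the child.
overridden = xml.dom.minidom.parseString(
    '<href xmlns="DAV:"><no_ns xmlns=""/></href>')

inherited_uri = inherited.documentElement.firstChild.namespaceURI
overridden_uri = overridden.documentElement.firstChild.namespaceURI
print(repr(inherited_uri))
print(repr(overridden_uri))
```

Only the second document reproduces an element that was created with a null namespaceURI.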

[Paul Boddie] As for the first issue - the presence of the xmlns attribute in the
serialised document - I'd be interested to hear whether it is
considered acceptable to parse the serialised document and to find that
no non-null namespaceURI is set on the href element, given that such a
namespaceURI was set when the document was created.

The key issue: should the serialised-then-reparsed document have the
same DOM "content" (XML InfoSet) if the user did not explicitly create
the requisite namespace declaration attributes?

My answer: No, it should not be the same.
My reasoning: The user did not explicitly create the attributes
=> The DOM should not automagically create them (according to
the L2 spec)
=> such attributes should not be serialised
- The user didn't create them
- The DOM implementation didn't create them
- If the serialisation processor creates them, that gives the
same end result as if the DOM impl had (wrongly) created them.
=> the serialisation is a faithful/naive representation of the
(not-namespace-well-formed) DOM constructed by the user (who
omitted required attributes).
=> The reloaded document is a different DOM to the original, i.e.
it has a different infoset.
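
The chain of reasoning above can be observed directly with minidom. The following is a minimal sketch in Python 3 syntax, assuming the standard-library minidom still serialises only the attributes that were explicitly created (which, as far as I know, remains the case in current CPython releases):

```python
import xml.dom.minidom

document = xml.dom.minidom.Document()
element = document.createElementNS("DAV:", "href")
document.appendChild(element)

# In memory, the element knows its namespace.
print(repr(element.namespaceURI))

# No xmlns attribute was created, so none is written out.
serialised = document.toxml()
print(serialised)

# Reparsing therefore yields an element in no namespace: a different infoset.
reparsed = xml.dom.minidom.parseString(serialised)
print(repr(reparsed.documentElement.namespaceURI))
```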

The xerces and jython snippet I posted the other day demonstrates this.
If you look closely at that code, the actual DOM implementation and the
serialisation processor used are from different libraries. The DOM is
the inbuilt JAXP DOM implementation, Apache Crimson (the example only
works on JDK 1.4). The serialisation processor is the Apache Xerces
serialiser. The fact that the xmlns="DAV:" attribute didn't appear in
the output document shows that BOTH the (Crimson) DOM implementation AND
the (Xerces) serialiser chose NOT to automagically create the attribute.

If you run that snippet with other DOM implementations, by setting the
"javax.xml.parsers.DocumentBuilderFactory" property, you'll find the
same result.

Serialisation and namespace normalisation are both in the realm of DOM
Level 3, whereas minidom is only L2 compliant. Automagically introducing
L3 semantics into the L2 implementation is the wrong thing to do.

http://www.w3.org/TR/DOM-Level-3-LS/load-save.html
http://www.w3.org/TR/2004/REC-DOM-Le...lgorithms.html

[Paul Boddie] In other words, ...

What should the "Namespace is" message produce?

Namespace is None

If you want it to produce,

Namespace is 'DAV:'

and for your code to be portable to other DOM implementations besides
libxml2dom, then your code should look like:-
document = libxml2dom.createDocument(None, "doc", None)
top = document.xpath("*")[0]
elem1 = document.createElementNS("DAV:", "href")
elem1.setAttributeNS(xml.dom.XMLNS_NAMESPACE, "xmlns", "DAV:")
document.replaceChild(elem1, top)
elem2 = document.createElementNS(None, "no_ns")
elem2.setAttributeNS(xml.dom.XMLNS_NAMESPACE, "xmlns", "")
document.xpath("*")[0].appendChild(elem2)
document.toFile(open("test_ns.xml", "wb"))
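
For comparison, the same explicit-declaration approach can be sketched against minidom itself. This is an illustration in Python 3 syntax using only standard-library calls; it serialises to a string with toxml() instead of writing a file, which is a simplification made for this example.

```python
import xml.dom
import xml.dom.minidom

document = xml.dom.minidom.Document()

href = document.createElementNS("DAV:", "href")
# Declare the default namespace explicitly, as the DOM L2 spec expects.
href.setAttributeNS(xml.dom.XMLNS_NAMESPACE, "xmlns", "DAV:")
document.appendChild(href)

no_ns = document.createElementNS(None, "no_ns")
# Switch the default namespace off again for the child.
no_ns.setAttributeNS(xml.dom.XMLNS_NAMESPACE, "xmlns", "")
href.appendChild(no_ns)

serialised = document.toxml()
print(serialised)

reparsed = xml.dom.minidom.parseString(serialised)
print("Namespace is", repr(reparsed.documentElement.namespaceURI))
```

With both declarations created by hand, the reparsed root element should report 'DAV:' and the child should report None, matching the document as it was built.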


its-not-about-namespaces-its-about-automagic-ly'yrs,

--
alan kennedy
------------------------------------------------------
email alan: http://xhaus.com/contact/alan
Dec 11 '05 #34

Alan Kennedy wrote:
Serialisation and namespace normalisation are both in the realm of DOM
Level 3, whereas minidom is only L2 compliant. Automagically introducing
L3 semantics into the L2 implementation is the wrong thing to do.

I think I'll have to either add some configuration support, in order to
let the user specify which standards they have in mind, or to
deny/assert support for one or another of the standards. It's
interesting that minidom plus PrettyPrint seems to generate the xmlns
attributes in the serialisation, though; should that be reported as a
bug?

As for the toxml method in minidom, the subject did seem to be briefly
discussed on the XML-SIG mailing list earlier in the year:

http://mail.python.org/pipermail/xml...ly/011157.html

[Alan Kennedy]
its-not-about-namespaces-its-about-automagic-ly'yrs


Well, with the automagic, all DOM users get the once in a lifetime
chance to exchange those lead boots for concrete ones. I'm sure there
are all sorts of interesting reasons for assigning namespaces to nodes,
serialising the document, and then not getting all the document
information back when parsing it, but I'd rather be spared all the
"amusement" behind all those reasons and just have life made easier for
just about everyone concerned. I think the closing remarks in the
following message say it pretty well:

http://mail-archives.apache.org/mod_mbox/xml-security-dev/200409.mbox/<1095071819.17967.44.camel%40amida>

And there are some interesting comments on this archived page, too:

http://web.archive.org/web/20010211173643/http://xmlbastard.editthispage.com/discuss/msgReader$6

Anyway, thank you for your helpful commentary on this matter!

Paul

Dec 11 '05 #35

[Paul Boddie]
It's
interesting that minidom plus PrettyPrint seems to generate the xmlns
attributes in the serialisation, though; should that be reported as a
bug?

I believe that it is a bug.

[Paul Boddie] Well, with the automagic, all DOM users get the once in a lifetime
chance to exchange those lead boots for concrete ones. I'm sure there
are all sorts of interesting reasons for assigning namespaces to nodes,
serialising the document, and then not getting all the document
information back when parsing it, but I'd rather be spared all the
"amusement" behind all those reasons and just have life made easier for
just about everyone concerned.

Well, if you have a fair amount of spare time and really want to improve
things, I recommend that you consider implementing the DOM L3 namespace
normalisation algorithm.

http://www.w3.org/TR/2004/REC-DOM-Le...lgorithms.html

That way, everyone can have namespace well-formed documents by simply
calling a single method, and not a line of automagic in sight: just
standards-compliant XML processing.
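
To give a feel for what such a method involves, here is a deliberately simplified sketch for minidom in Python 3. It is not the W3C algorithm: the helper name add_missing_declarations is hypothetical, it looks only at element names (the real algorithm also fixes up attributes and resolves prefix conflicts), and it is meant as a starting point rather than a conforming implementation.

```python
import xml.dom
import xml.dom.minidom


def add_missing_declarations(element, in_scope=None):
    """Hypothetical helper, not part of minidom or of the DOM specifications."""
    # in_scope maps a prefix (None for the default namespace) to a URI.
    in_scope = dict(in_scope or {})

    # Honour declarations the caller already created explicitly.
    attributes = element.attributes
    for index in range(attributes.length):
        attr = attributes.item(index)
        if attr.namespaceURI == xml.dom.XMLNS_NAMESPACE:
            declared_prefix = attr.name.partition(":")[2] or None
            in_scope[declared_prefix] = attr.value or None

    prefix = element.prefix or None
    uri = element.namespaceURI or None
    # A prefixed name with no namespace cannot be repaired by a declaration,
    # so this sketch leaves that case alone.
    if in_scope.get(prefix) != uri and not (prefix and uri is None):
        name = "xmlns:" + prefix if prefix else "xmlns"
        element.setAttributeNS(xml.dom.XMLNS_NAMESPACE, name, uri or "")
        in_scope[prefix] = uri

    for child in element.childNodes:
        if child.nodeType == child.ELEMENT_NODE:
            add_missing_declarations(child, in_scope)


document = xml.dom.minidom.Document()
href = document.createElementNS("DAV:", "href")
document.appendChild(href)
href.appendChild(document.createElementNS(None, "no_ns"))

# The fixup is an explicit call made by the user, not serialiser automagic.
add_missing_declarations(document.documentElement)
reparsed = xml.dom.minidom.parseString(document.toxml())
print(repr(reparsed.documentElement.namespaceURI))
print(repr(reparsed.documentElement.firstChild.namespaceURI))
```

Because the declarations are added to the tree by an explicit call, toxml() itself stays unchanged and code that never calls the helper sees the existing behaviour.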
[Paul Boddie] Anyway, thank you for your helpful commentary on this matter!


And thanks to you for actually informing yourself on the issue, and for
taking the time to research and understand it. I wish that your
refreshing attitude was more widespread!

now-i-really-must-get-back-to-work-ly'yrs,

--
alan kennedy
------------------------------------------------------
email alan: http://xhaus.com/contact/alan
Dec 12 '05 #36

Uche Ogbuji <uc*********@gmail.com> wrote:
Andrew Clover also suggested an overly-legalistic argument that current
minidom behavior is not a bug.

I stick by my language-law interpretation of spec. DOM 2 Core
specifically disclaims any responsibility for namespace fixup and
advises the application writer to do it themselves if they want to be
sure of the right output. W3C knew they weren't going to get all that
standardised by Level 2 so they left it open for future work - if
minidom claimed to support DOM 3 LS it would be a different matter.

'<?xml version="1.0" ?>\n<ferh/>' (i.e. "ferh" rather than "href"), would you not consider that a minidom
bug?

It's not a *spec* bug, as no spec that minidom claims to conform to
says anything about serialisation. It's a *minidom* bug in that it
fails to conform to the minimal documentation of the method toxml()
which claims to "Return the XML that the DOM represents as a string" -
the DOM does not represent that XML.

However that doc for toxml() says nothing about being namespace-aware.
XML and XML-with-namespaces both still exist, and for the former class
of document the minidom behaviour is correct.

The reality is that once the poor user has done:
element = document.createElementNS("DAV:", "href")
They are following DOM specification that they have created an element
in a namespace

It's possible that a namespaced node could also be imported/parsed into
a non-namespace document and then serialised; it's particularly likely
this could happen for scripts processing XHTML.

We shouldn't change the existing behaviour for toxml/writexml because
people may be relying on it. One of the reasons I ended up writing a
replacement was that the behaviour of minidom was not only wrong, but
kept changing under my feet with each version.

However, adding the ability to do fixup on serialisation would indeed
be very welcome - toxmlns() maybe, or toxml(namespaces=True)?

I'll be sure to emphasize heavily to users that minidom is broken
with respect to Namespaces and serialization, and that they
abandon it in favor of third-party tools.

Well yes... there are in any case more fundamental bugs than just
serialisation problems.

Fredrik wrote:
can anyone perhaps dig up a DOM L2 implementation that's not written
by anyone involved in this thread


<g>

--
And Clover
mailto:an*@doxdesk.com
http://doxdesk.com/

Dec 19 '05 #37
