DOM2 API (Java): how to get namespace declarations?

Simon Brooke

I was debugging a new XML generator tonight, trying to determine why
it wasn't working, and realised that my DOM printer does not output XML
namespace declarations.

My method to output an Element is as follows:

/**
 * Print an element node and, by recursive descent, its children
 *
 * @param node the node to print
 * @param out the stream to print it on
 * @param url the base URL to use in expanding relative URLs
 * @param level the indentation level if pretty printing
 */
protected void print( Element node, PrintStream out, URL url,
    int level )
    throws IOException
{
    indent( out, level );
    out.print( '<' );

    String tagname = node.getNodeName( );
    out.print( tagname );

    NamedNodeMap attrs = node.getAttributes( );
    NodeList children = node.getChildNodes( );

    /* Get the attributes of the node and print their values. */
    for ( int i = 0; i < attrs.getLength( ); i++ )
    {
        print( ( (Attr) attrs.item( i ) ), out, url, level + 1 );
    }

    if ( ( children != null ) && ( children.getLength( ) > 0 ) )
    { // it's a non-empty tag
        out.print( '>' );

        int len = children.getLength( );

        for ( int i = 0; i < len; i++ )
        {
            print( children.item( i ), out, url, level + 1 );
        }

        /* Set the end tag. */
        indent( out, level );
        out.print( '<' );
        out.print( '/' );
        out.print( tagname );
    }
    else // it's an empty tag
    {
        out.print( " /" );
    }

    out.print( '>' );
}

Performing the exact same XSL transform, the Xerces printer emits:

<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF version="1.0"
xmlns:syn="http://purl.org/rss/1.0/modules/syndication/"
xmlns:geourl="http://geourl.org/rss/module/"
xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
<rss version="0.91">
...

whereas my printer emits:

<rdf:RDF version="1.0">
<rss version="0.91">
...

The relevant part of the XSL file reads:

<xsl:template match="category">
<rdf:RDF version="1.0"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#"
xmlns:geourl="http://geourl.org/rss/module/"
xmlns:syn="http://purl.org/rss/1.0/modules/syndication/">
<rss version="0.91">
...

Clearly what Xerces is emitting is right and what I am emitting is wrong,
but I'm having trouble seeing what I'm doing wrong. My method to output
an attribute node is as follows:

/**
 * Print an attribute node. If url is not null, use it as a base URL
 * for expanding URL values.
 *
 * @param node the node to print
 * @param out the stream to print it on
 * @param url the base URL to use in expanding relative URLs
 * @param level the indentation level if pretty printing
 */
protected void print( Attr node, PrintStream out, URL url,
    int level )
    throws IOException
{
    String delimiter = "\"";
    String value = node.getNodeValue( );

    if ( value != null )
    {
        /* As I understand it, you aren't allowed unvalued
         * attributes in XML
         */
        value = cleanString( value, true );
        /* are attribute values allowed to contain *any*
         * characters? */

        /* if an attribute has double quotes in its value, we'll use
         * single quotes as the delimiter and vice versa. If it has
         * both we're stuffed. */
        if ( value.indexOf( delimiter ) > -1 )
        {
            delimiter = "'";
        }

        indent( out, level );
        out.print( " " );
        out.print( node.getNodeName( ) );
        out.print( "=" );
        out.print( delimiter );

        /* If this is an attribute whose value
         * should be a URL. */
        if ( ( node.getNodeName( ).equalsIgnoreCase( "href" ) ||
               node.getNodeName( ).equalsIgnoreCase( "link" ) ||
               node.getNodeName( ).equalsIgnoreCase( "src" ) ) &&
             ( url != null ) )
        {
            /* Change the partial URL to a full URL. */
            try
            {
                String fullURL = new URL( url, value ).toString( );

                out.print( fullURL );
            }
            catch ( MalformedURLException m )
            {
                // log
                m.printStackTrace();
            }
        }
        else
        {
            /* If I've got a value, clean it and
             * print it. */
            out.print( value );
        }

        out.print( delimiter );
    }
    else
    {
        System.err.println( "Unvalued attribute: " +
            node.getNodeName( ));
    }
}

Neither the MalformedURLException nor the string 'Unvalued attribute'
ever appears in the log. From this it seems that neither
Node.getAttributes() nor Node.getChildNodes() returns the namespace
declarations. Yet I can't see any other no-args get...() method in the
API. Reading through the Xerces XMLSerializer code makes it seem that
they are finding the namespace declarations among the attributes.

Can anyone see what I'm doing wrong? I appreciate it's probably some basic
howler, but I just can't see it myself.

--
si***@jasmine.org.uk (Simon Brooke) http://www.jasmine.org.uk/~simon/

Hobbit ringleader gives Sauron One in the Eye.

Feb 11 '06 #1

Joe Kesselman

Simon Brooke wrote:

> I was debugging a new XML generator tonight and trying to determine why
> it wasn't working; and realised my dom printer does not output XML
> namespace declarations.

XML namespace declarations are optional in the DOM, since every node
carries its namespace and bindings can be reconstructed when you
serialize the DOM's contents as XML. The flipside is that it is the
serializer's responsibility to check that the necessary declarations are
present as Attribute nodes, and/or to synthesize those declarations.

The DOM Level 3 spec should have a fairly detailed description of one
algorithm for doing that check and fixup. (I drafted the first version
of that logic, though I think it's been tweaked a bit since then.) I'd
suggest reading that before implementing your own DOM-printer.

Alternatively, you can insist that whoever constructs your DOM take
responsibility for making sure that all the necessary Attribute nodes
exist to declare the namespaces. (Note that they have to be in the
correct namespace themselves...). But it's probably better not to count
on that unless you have full control of both sides of the system.

Note that most DOM implementations these days ship with serializers that
know how to do the right things, so unless you're creating your own DOM
or have unusual formatting requirements it might be simpler to just use
those rather than reimplementing that code. (And of course DOM Level 3
proposes a standard API for that function.)

But doing a recursive-descent DOM printer _is_ a good learning exercise,
so it's probably something you should write at least once. Among other
things, the same tree-walking logic is useful for many other kinds of
DOM processing.
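
To make the above concrete, here is a minimal sketch, assuming a JAXP parser whose DOM implementation supports the DOM Level 3 Load and Save feature (the one bundled with current JDKs should). The class and method names (NamespaceFixupDemo, buildDocument, serialize) are invented for the example. It creates an rdf:RDF element with createElementNS, so the node carries its namespace but has no xmlns:rdf Attr node, and then hands the tree to an LSSerializer, which is expected to synthesize the missing declaration.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSSerializer;

public class NamespaceFixupDemo {
    static final String RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#";

    /** Build a tree whose root is in the RDF namespace but has no xmlns:rdf Attr node. */
    static Document buildDocument() {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder().newDocument();
            // The namespace is a property of the node; no declaration attribute is created.
            Element root = doc.createElementNS(RDF_NS, "rdf:RDF");
            root.setAttribute("version", "1.0");
            doc.appendChild(root);
            return doc;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    /** Serialize with the DOM Level 3 Load and Save API (assumes the "LS" feature is available). */
    static String serialize(Document doc) {
        DOMImplementationLS ls =
                (DOMImplementationLS) doc.getImplementation().getFeature("LS", "3.0");
        LSSerializer serializer = ls.createLSSerializer();
        return serializer.writeToString(doc);
    }

    public static void main(String[] args) {
        Document doc = buildDocument();
        // Only the 'version' attribute is visible to a hand-written printer at this point.
        System.out.println("attributes before serializing: "
                + doc.getDocumentElement().getAttributes().getLength());
        System.out.println(serialize(doc));
    }
}
```

Before serialization, getAttributes() on the root holds only the version attribute, which is what a hand-written printer sees. A serializer may perform its fixup by adding Attr nodes to the tree itself, so any check for the missing declaration should be made before serializing rather than after.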

Feb 11 '06 #2

Simon Brooke

in message <eN******************************@comcast.com>, Joe Kesselman
('k*************@comcast.net') wrote:

> Simon Brooke wrote:
>> I was debugging a new XML generator tonight and trying to determine
>> why it wasn't working; and realised my dom printer does not output XML
>> namespace declarations.
>
> XML namespace declarations are optional in the DOM, since every node
> carries its namespace and bindings can be reconstructed when you
> serialize the DOM's contents as XML. The flipside is that it is the
> serializer's responsibility to check that the necessary declarations
> are present as Attribute nodes, and/or to synthesize those
> declarations.

Thanks very much!

> The DOM Level 3 spec should have a fairly detailed description of one
> algorithm for doing that check and fixup. (I drafted the first version
> of that logic, though I think it's been tweaked a bit since then.) I'd
> suggest reading that before implementing your own DOM-printer.

OK, got it.
<URL:http://www.w3.org/TR/2004/REC-DOM-Level-3-Core-20040407/namespaces-algorithms.html>

> Note that most DOM implementations these days ship with serializers
> that know how to do the right things, so unless you're creating your
> own DOM or have unusual formatting requirements it might be simpler to
> just use those rather than reimplementing that code. (And of course DOM
> Level 3 proposes a standard API for that function.)

Yup. The thing is I wrote my printer back in February 2000 when there
weren't a lot of others around, which makes it surprising that its
failure to do the right things with namespaces hasn't tripped me up
before. It would probably be more economic now to just make a call to
the DOM3 serialiser API, but as a matter of craftsmanship I'd like to
get mine right.

OK, so: we look at a node and see if it needs a namespace, and if it does
we generate a namespace declaration. Suppose we have a structure

<a>
  <b>
    <foo:c/>
    <foo:d/>
  </b>
  <bar:e/>
</a>

am I right in thinking that it would be correct to attach the 'foo'
namespace declaration at any of nodes c /and/ d, or at node b, or at
node a, and the 'bar' namespace declaration at either node e or node a?

Clearly not duplicating the declaration makes the job of the parser
easier. Is there any good reason not to pre-scan the tree and collect all
of the namespaces used and declare them on the root element of the
document? Looking at the 'algorithms' page it seems that unless two
elements use the same prefix to indicate different namespaces, there
should be no problem in 'shuffling' namespace declarations as high up the
tree as possible.

--
si***@jasmine.org.uk (Simon Brooke) http://www.jasmine.org.uk/~simon/

;; When all else fails, read the distractions.

Feb 11 '06 #3

Bjoern Hoehrmann

* Simon Brooke wrote in comp.text.xml:

> OK, so: we look at a node and see if it needs a namespace, and if it does
> we generate a namespace declaration. Suppose we have a structure
>
> <a>
>   <b>
>     <foo:c/>
>     <foo:d/>
>   </b>
>   <bar:e/>
> </a>
>
> am I right in thinking that it would be correct to attach the 'foo'
> namespace declaration at any of nodes c /and/ d, or at node b, or at
> node a, and the 'bar' namespace declaration at either node e or node a?

xmlns:foo must be in scope of c and d, adding them there would do the
job, as well as adding them to one of the ancestors. Adding them to
a,b,c,d would also be possible, for example, but probably be redundant.
Note that 'foo' might map to different namespace names on different
elements, e.g.

<x>
<y:z xmlns:y='foo' />
<y:z xmlns:y='bar' />
</x>

would also be possible and there might be content that depends on the
prefixes (e.g., XPath expressions in a XSLT document), so if you have

<x some-qname-attribute='y:z' xmlns:y='foo'>
<y:example />
</x>

mapping that to

<x some-qname-attribute='y:z'>
<y:example xmlns:y='foo' />
</x>

might be a bad idea.

> Clearly not duplicating the declaration makes the job of the parser
> easier. Is there any good reason not to pre-scan the tree and collect all
> of the namespaces used and declare them on the root element of the
> document? Looking at the 'algorithms' page it seems that unless two
> elements use the same prefix to indicate different namespaces, there
> should be no problem in 'shuffling' namespace declarations as high up the
> tree as possible.

This is true in general, but it would turn a probably incorrect document
like

<x some-qname-attribute='y:z'>
<y:example xmlns:y='foo' />
</x>

into a correct document, which might not be intended. Of course, QNames
in content might not be a concern for your application.
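
The QName-in-content concern can be checked with a small sketch like the one below. It assumes a namespace-aware JAXP parser and uses DOM Level 3's Node.lookupNamespaceURI; the class name PrefixScopeDemo, its helper methods and the urn:example:foo namespace name are made up for illustration. It asks what the prefix 'y' is bound to at the x element, which is where the attribute value 'y:z' would have to be resolved, first with the declaration left on the child and then with it hoisted to x.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

public class PrefixScopeDemo {
    /** Parse a string with a namespace-aware parser and return the document element. */
    static Element parseRoot(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    /** Namespace name that prefix 'y' resolves to at the root element, or null if unbound there. */
    static String bindingOfYAtRoot(String xml) {
        return parseRoot(xml).lookupNamespaceURI("y");
    }

    public static void main(String[] args) {
        String declaredOnChild =
                "<x some-qname-attribute='y:z'><y:example xmlns:y='urn:example:foo'/></x>";
        String hoistedToRoot =
                "<x some-qname-attribute='y:z' xmlns:y='urn:example:foo'><y:example/></x>";

        // With the declaration on the child, 'y' is not in scope on x itself.
        System.out.println("declared on child: " + bindingOfYAtRoot(declaredOnChild));
        // After hoisting, the same attribute value resolves to a namespace.
        System.out.println("hoisted to root:   " + bindingOfYAtRoot(hoistedToRoot));
    }
}
```

Both documents have identical element names and namespaces, so a serializer that looks only at elements and attributes cannot tell them apart; the difference shows up only for content that is interpreted as a QName.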
--
Björn Höhrmann · mailto:bj****@hoehrmann.de · http://bjoern.hoehrmann.de
Weinh. Str. 22 · Telefon: +49(0)621/4309674 · http://www.bjoernsworld.de
68309 Mannheim · PGP Pub. KeyID: 0xA4357E78 · http://www.websitedev.de/

Feb 11 '06 #4

Simon Brooke

in message <s1********************************@hive.bjoern.hoehrmann.de>,
Bjoern Hoehrmann ('b*****@hoehrmann.de') wrote:

> * Simon Brooke wrote in comp.text.xml:
>> OK, so: we look at a node and see if it needs a namespace, and if it
>> does we generate a namespace declaration. Suppose we have a structure
>>
>> <a>
>>   <b>
>>     <foo:c/>
>>     <foo:d/>
>>   </b>
>>   <bar:e/>
>> </a>
>>
>> am I right in thinking that it would be correct to attach the 'foo'
>> namespace declaration at any of nodes c /and/ d, or at node b, or at
>> node a, and the 'bar' namespace declaration at either node e or node a?
>
> xmlns:foo must be in scope of c and d, adding them there would do the
> job, as well as adding them to one of the ancestors. Adding them to
> a,b,c,d would also be possible, for example, but probably be redundant.
> Note that 'foo' might map to different namespace names on different
> elements, e.g.
>
> <x>
> <y:z xmlns:y='foo' />
> <y:z xmlns:y='bar' />
> </x>
>
> would also be possible and there might be content that depends on the
> prefixes (e.g., XPath expressions in a XSLT document), so if you have
>
> <x some-qname-attribute='y:z' xmlns:y='foo'>
> <y:example />
> </x>
>
> mapping that to
>
> <x some-qname-attribute='y:z'>
> <y:example xmlns:y='foo' />
> </x>
>
> might be a bad idea.
>
>> Clearly not duplicating the declaration makes the job of the parser
>> easier. Is there any good reason not to pre-scan the tree and collect
>> all of the namespaces used and declare them on the root element of the
>> document? Looking at the 'algorithms' page it seems that unless two
>> elements use the same prefix to indicate different namespaces, there
>> should be no problem in 'shuffling' namespace declarations as high up
>> the tree as possible.
>
> This is true in general, but it would turn a probably incorrect
> document like
>
> <x some-qname-attribute='y:z'>
> <y:example xmlns:y='foo' />
> </x>
>
> into a correct document, which might not be intended. Of course, QNames
> in content might not be a concern for your application.

OK, my algorithm at this stage is as follows:

if ( responsibleForNamespaceDeclarations )
{
    try
    {
        spaces = recursivelyCollectNamespaces( node );

        Enumeration keys = spaces.keys( );

        while ( keys.hasMoreElements( ) )
        {
            String key = keys.nextElement( ).toString( );
            printNS( key, spaces.get( key ).toString( ), out,
                level + 1 );
        }

        /* everything down-tree is now declared here */
        responsibleForNamespaceDeclarations = false;
    }
    catch ( NamespaceCollisionException e )
    {
        /* declare only this node's own namespace; the flag stays
         * set, so the children sort out their own */
        String uri = node.getNamespaceURI( );
        String prefix = node.getPrefix( );

        if ( ( uri != null ) && ( prefix != null ) )
        {
            printNS( prefix, uri, out, level + 1 );
        }

        System.err.println( "Namespace clash: " + e.getMessage( ) );
    }
}
...
/* NodeList has getLength( ), not length( ) */
for ( int i = 0; i < children.getLength( ); i++ )
{
    print( children.item( i ), out, level + 1,
        responsibleForNamespaceDeclarations );
}

That is to say, when printing an element node, I do recursive descent to
collect all the namespaces down tree from it. If there is a collision,
then if I have a local namespace to deal with, I deal with that locally,
and leave responsibility for printing namespaces set for the child
nodes. If there is no collision, then I deal with all the down-tree
namespaces and clear the responsibleForNamespaceDeclarations flag.
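
For illustration, the collection step might look something like the sketch below. recursivelyCollectNamespaces and NamespaceCollisionException are named above but not shown, so the bodies here are guesses rather than the actual code; the exception is unchecked purely to keep the example short, a Map stands in for the Hashtable, and parse is a helper invented for the demo. An unprefixed element with no namespace is recorded as binding the empty prefix to the empty string, so that it collides with a default namespace used elsewhere in the subtree.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class NamespaceCollector {

    /** Hypothetical stand-in for the exception named in the post. */
    static class NamespaceCollisionException extends RuntimeException {
        NamespaceCollisionException(String message) { super(message); }
    }

    /** Map of prefix ("" for none) to namespace URI ("" for none) used at or below node. */
    static Map<String, String> recursivelyCollectNamespaces(Node node) {
        Map<String, String> spaces = new LinkedHashMap<>();
        collect(node, spaces);
        return spaces;
    }

    private static void collect(Node node, Map<String, String> spaces) {
        if (node.getNodeType() != Node.ELEMENT_NODE) {
            return; // text, comments etc. carry no namespace bindings
        }
        String elementUri = node.getNamespaceURI();
        record(node.getPrefix(), elementUri == null ? "" : elementUri, spaces);

        NamedNodeMap attrs = node.getAttributes();
        for (int i = 0; i < attrs.getLength(); i++) {
            Node attr = attrs.item(i);
            String uri = attr.getNamespaceURI();
            // Unprefixed attributes are in no namespace; xmlns and xml need no declaring.
            if (uri != null && attr.getPrefix() != null
                    && !XMLConstants.XMLNS_ATTRIBUTE_NS_URI.equals(uri)
                    && !XMLConstants.XML_NS_URI.equals(uri)) {
                record(attr.getPrefix(), uri, spaces);
            }
        }

        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            collect(children.item(i), spaces);
        }
    }

    private static void record(String prefix, String uri, Map<String, String> spaces) {
        String key = (prefix == null) ? "" : prefix;
        String previous = spaces.putIfAbsent(key, uri);
        if (previous != null && !previous.equals(uri)) {
            throw new NamespaceCollisionException("prefix '" + key
                    + "' is bound to both '" + previous + "' and '" + uri + "'");
        }
    }

    /** Demo helper: namespace-aware parse of a string. */
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse("<a><b><foo:c xmlns:foo='urn:example:foo'/></b>"
                + "<bar:e xmlns:bar='urn:example:bar'/></a>");
        System.out.println(recursivelyCollectNamespaces(doc.getDocumentElement()));
    }
}
```

A printer using this would skip the entry that maps the empty prefix to the empty string, since there is nothing to declare for it. The sketch looks only at element and attribute names, so it does not notice prefixes used inside attribute values or text, which is the QNames-in-content case raised earlier in the thread.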

Can anyone see problems with this? And what do I do about the default
namespace? Will the default namespace have getNamespaceURI() non-null
and getPrefix() null?

--
si***@jasmine.org.uk (Simon Brooke) http://www.jasmine.org.uk/~simon/

The Conservative Party is now dead. The corpse may still be
twitching, but resurrection is not an option - unless Satan
chucks them out of Hell as too objectionable even for him.

Feb 11 '06 #5

Bjoern Hoehrmann

* Simon Brooke wrote in comp.text.xml:

> Can anyone see problems with this? And what do I do about the default
> namespace? Will the default namespace have getNamespaceURI() non-null
> and getPrefix() null?

http://lists.w3.org/Archives/Public/...5Dec/0017.html
--
Björn Höhrmann · mailto:bj****@hoehrmann.de · http://bjoern.hoehrmann.de
Weinh. Str. 22 · Telefon: +49(0)621/4309674 · http://www.bjoernsworld.de
68309 Mannheim · PGP Pub. KeyID: 0xA4357E78 · http://www.websitedev.de/

Feb 11 '06 #6

Joe Kesselman

Simon Brooke wrote:

> ? Will the default namespace have getNamespaceURI() non-null
> and getPrefix() null?

Yes.
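
A quick way to confirm this on a given parser is sketched below, assuming a namespace-aware JAXP DocumentBuilder; the class name DefaultNamespaceDemo, the helper parseRoot and the urn:example:default namespace name are invented for the example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

public class DefaultNamespaceDemo {
    /** Parse a string with a namespace-aware parser and return the document element. */
    static Element parseRoot(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Element root = parseRoot("<a xmlns='urn:example:default'><b/></a>");
        // <b/> inherits the default namespace from <a> and has no prefix of its own.
        Element b = (Element) root.getFirstChild();
        System.out.println("namespace URI: " + b.getNamespaceURI());
        System.out.println("prefix:        " + b.getPrefix());
    }
}
```

So a printer that needs to declare a default namespace can test for a non-null getNamespaceURI() together with a null getPrefix(), and emit a plain xmlns attribute rather than an xmlns:prefix one.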

--
() ASCII Ribbon Campaign | Joe Kesselman
/\ Stamp out HTML e-mail! | System architexture and kinetic poetry

Feb 11 '06 #7

Simon Brooke

in message <Cf********************@comcast.com>, Joe Kesselman
('k*************@comcast.net') wrote:

> Simon Brooke wrote:
>> ? Will the default namespace have getNamespaceURI() non-null
>> and getPrefix() null?
>
> Yes.

Thanks.

--
si***@jasmine.org.uk (Simon Brooke) http://www.jasmine.org.uk/~simon/

;; no eternal reward will forgive us now for wasting the dawn.
;; Jim Morrison

Feb 11 '06 #8

Simon Brooke

in message <Cf********************@comcast.com>, Joe Kesselman
('k*************@comcast.net') wrote:

> Simon Brooke wrote:
>> ? Will the default namespace have getNamespaceURI() non-null
>> and getPrefix() null?
>
> Yes.

Thanks.

[did I reply to this already?]

--
si***@jasmine.org.uk (Simon Brooke) http://www.jasmine.org.uk/~simon/
Iraq war: it's time for regime change...
... go now, Tony, while you can still go with dignity.
[update 18 months after this .sig was written: it's still relevant]

Feb 11 '06 #9
