Document.importNode(Node,boolean) - what supports it?

The DOM API has included public Node importNode(Node,boolean) as a method
of the Document interface for a long time. Does anything actually
implement it? Xerces 2 is giving me:

org.w3c.dom.DOMException: NOT_SUPPORTED_ERR: The implementation does not
support the requested type of object or operation.
at org.apache.xerces.dom.CoreDocumentImpl.importNode( Unknown Source)
at org.apache.xerces.dom.CoreDocumentImpl.importNode( Unknown Source)
at uk.co.weft.domutil.MaybeParseGenerator.maybeParse( MaybeParseGenerator.java:183)

This is so whether the node I'm trying to import is an
org.apache.xerces.dom.DeferredElementImpl (i.e. parsed with Xerces) or a
org.apache.crimson.tree.ElementNode (i.e. parsed with Crimson).

--
si***@jasmine.org.uk (Simon Brooke) http://www.jasmine.org.uk/~simon/
Ye hypocrites! are these your pranks? To murder men and give God thanks?
Desist, for shame! Proceed no further: God won't accept your thanks for
murther
-- Robert Burns, 'Thanksgiving For a National Victory'

Mar 16 '07 #1
Simon Brooke wrote:
> The DOM API has included public Node importNode(Node,boolean) as a method
> of the Document interface for a long time. Does anything actually
> implement it?

Certainly should work; I wrote Xerces' first implementation of that
function, and in fact was one of those who lobbied the DOM WG to include
it in the standard. If the node being imported properly implements the
DOM APIs, and the implementation being imported into doesn't have some
reason for blocking this (eg, that it's specifically a read-only DOM,
such as the DOM view of Xalan's internal data model), the function
should work. It isn't rocket science, after all; it's just a tree-walker
feeding a tree-builder.

I have to believe the problem resides in something you haven't told us.
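
To make the "tree-walker feeding a tree-builder" description concrete, here is a minimal sketch of the idea. It is not the Xerces implementation: the class name TreeCopySketch and its methods are hypothetical, it handles only element and text nodes, and it ignores namespaces. A real importNode covers every importable node type.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;

// Hypothetical illustration only; not how Xerces actually implements importNode.
class TreeCopySketch {

    // Walk 'source' (from any DOM) and rebuild it using 'target' as the node factory.
    static Node copyInto(Document target, Node source) {
        switch (source.getNodeType()) {
            case Node.TEXT_NODE:
                return target.createTextNode(source.getNodeValue());
            case Node.ELEMENT_NODE: {
                Element copy = target.createElement(source.getNodeName());
                NamedNodeMap attrs = source.getAttributes();
                for (int i = 0; i < attrs.getLength(); i++) {
                    Attr a = (Attr) attrs.item(i);
                    copy.setAttribute(a.getName(), a.getValue());
                }
                // Recurse over the children: the walker feeds the builder.
                for (Node c = source.getFirstChild(); c != null; c = c.getNextSibling()) {
                    copy.appendChild(copyInto(target, c));
                }
                return copy;
            }
            default:
                // Other node types are deliberately out of scope for this sketch.
                throw new UnsupportedOperationException("node type " + source.getNodeType());
        }
    }

    // Creates an empty document with whatever JAXP implementation is the default.
    static Document newDocument() {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Builds a small source tree, copies it into a second document, and summarises the copy.
    static String demo() {
        Document source = newDocument();
        Element div = source.createElement("div");
        div.setAttribute("class", "Intro");
        div.appendChild(source.createTextNode("Here be dragons!"));
        source.appendChild(div);

        Document target = newDocument();
        Element copy = (Element) copyInto(target, source.getDocumentElement());
        return copy.getTagName() + "|" + copy.getAttribute("class") + "|"
                + copy.getTextContent() + "|" + (copy.getOwnerDocument() == target);
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The copy is owned by the target document but not yet attached to it, which matches what importNode is specified to return.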

--
() ASCII Ribbon Campaign | Joe Kesselman
/\ Stamp out HTML e-mail! | System architexture and kinetic poetry
Mar 16 '07 #2
in message <iJ******************************@comcast.com>, Joe Kesselman
('k*************@comcast.net') wrote:
> Simon Brooke wrote:
>> The DOM API has included public Node importNode(Node,boolean) as a
>> method of the Document interface for a long time. Does anything actually
>> implement it?
>
> Certainly should work; I wrote Xerces' first implementation of that
> function, and in fact was one of those who lobbied the DOM WG to include
> it in the standard. If the node being imported properly implements the
> DOM APIs, and the implementation being imported into doesn't have some
> reason for blocking this (eg, that it's specifically a read-only DOM,
> such as the DOM view of Xalan's internal data model), the function
> should work. It isn't rocket science, after all; it's just a tree-walker
> feeding a tree-builder.
>
> I have to believe the problem resides in something you haven't told us.

OK, then I have to believe that, too. Furthermore, this is another of the
bits of my code that have been around for a long time (since 2003 in this
case), and I'm sure it used to work (but it may only ever have worked with
Crimson). I have had occasions in the past where I have inadvertently
depended on bugs in a library, and when that library has been fixed all my
code broke.

If this class fails, it returns a text node with a 'flat' representation of
the embedded markup. Looking at the production server logs I see that it
has been intermittently failing in this way for some time, but that the
failure simply has not been noticed. The failure on the production servers
is different from the failure on the development server; I'll detail that
difference below. The production servers use Crimson to parse, but Xerces
to construct documents - I can't remember why, but it's probably just an
oversight.

The class in question is:

//***********************************************************************\
// *
// MaybeParseGenerator.java *
// *
// Author: Simon Brooke *
// Created: 17th January 2003 *
// $Revision: 1.7.4.3 $; $Date: 2006/09/04 13:45:54 $ *
// *
//***********************************************************************/
package uk.co.weft.domutil;

import org.w3c.dom.Document;
import org.w3c.dom.Node;

import org.xml.sax.InputSource;

import java.io.StringReader;

import javax.xml.parsers.DocumentBuilder;

import uk.co.weft.htform.ResourceConsumerImpl;
/*
* $Log: MaybeParseGenerator.java,v $
* Revision 1.7.4.3 2006/09/04 13:45:54 simon
* Added more debugging output. Have an intermittent bug in PRES which may
* originate here.
*
* Revision 1.7.4.2 2005/12/30 16:54:00 simon
* EkitWidget now working remarkably well. Still some tidying up to do.
*
* Revision 1.7.4.1 2005/12/23 10:48:33 simon
* Brute force tidy up after CVS server crash: this time it should work.
*
* Revision 1.7 2005/02/05 17:40:17 simon
* Improved diagnostics on failure
*
* Revision 1.6 2004/07/14 12:52:34 simon
* Final commit for 1.10.0
*
* Revision 1.5 2004/06/17 15:10:38 simon
* Extends ResourceConsumerImpl to gain access to grs, etc
*
* Revision 1.4 2003/10/30 12:40:21 simon
* Added debug flag in domutil classes
*
* Revision 1.3 2003/08/20 09:38:35 simon
* Code cleanup with eclipse; mostly removal of exccessive includes
*
* Revision 1.2 2003/07/09 09:32:07 simon
* Initial work on HTML generation of widgets.
*
* Revision 1.1 2003/02/06 11:22:26 simon
* New superclass for node generators which may want to parse XML text.
*/

/**
* Abstract superclass for TextNodeGenerator and ElementGenerator, which may
* want to parse their content. Parsing is potentially expensive, so if
* you're confident the value won't contain XML markup it may be worth
* calling allowEmbeddedMarkup( false ).
*
* @author Simon Brooke
* @version $Revision: 1.7.4.3 $ This revision: $Author: simon $
*/
public abstract class MaybeParseGenerator extends ResourceConsumerImpl
{
//~ Instance fields -----------------------------------------------------

/**
* whether or not I'm in debug mode; if I am I may print debugging
* messages to System.err
*/
protected boolean debug = false;

/** By default we allow embedded markup in children */
protected boolean embeddedMarkup = true;

//~ Constructors --------------------------------------------------------

/**
* Creates a new MaybeParseGenerator object.
*/
public MaybeParseGenerator( )
{
// ...nothing...
}

//~ Methods -------------------------------------------------------------

/**
* whether or not to set debugging mode. If true, the generator _may_
* write debugging messages to System.err
*
* @param debug whether or not to set debugging mode
*
* @since Jacquard 1.10
*/
public void setDebug( boolean debug )
{
this.debug = debug;
}

/**
* Do we allow (and parse for) embedded markup within the value of this
* node? default is we do.
*
* @param allow if true, then allow embedded markup within my value
*/
public void allowEmbeddedMarkup( boolean allow )
{
embeddedMarkup = allow;
}

/**
* Construct a node representing this value. It's perfectly possible (and
* possibly legitimate) that the value of a child should contain embedded
* markup. If so, try to parse a node out of it.
*
* @param doc the document in which the node is to be created
* @param unparsed the string, possibly with embedded markup, to parse
*
* @exception GenerationException if parsing fails
*/
protected Node maybeParse( Document doc, String unparsed )
throws GenerationException
{
Node val = doc.createTextNode( unparsed ); // safe default

if ( debug )
{
System.err.println( "MaybeParseGenerator.maybeParse: parsing [" +
unparsed + "]" );
}

if ( unparsed != null ) // defensive
{
if ( embeddedMarkup && (
// if we allow embedded markup
unparsed.indexOf( "<" ) > -1 ) ) // it looks like markup
{
if ( !unparsed.trim( ).startsWith( "<" ) )
{
// nasty: if it contains markup, but
// isn't contained in markup, the
// parser will barf.
unparsed = "<parsed>" + unparsed + "</parsed>";
}

try
{
DocumentBuilder parser = DOMStub.getParser( );

if ( parser == null )
{
System.err.println( "Could not initialise XML parser" );
}

InputSource i =
new InputSource( new StringReader( unparsed ) );

// i.setCharacterStream( new StringReader( unparsed ) );
Document parsed = parser.parse( i );

if ( debug )
{
System.err.println( "Parsed document: " +
parsed.toString( ) );

if ( parsed != null )
{
Node root = parsed.getDocumentElement( );

if ( root != null )
{
System.err.println( "Root node: (" +
root.getClass( ).getName( ) + "): " +
root.toString( ) );
}
}
}

val = doc.importNode( parsed, true );

if ( debug )
{
System.err.println(
"MaybeParseGenerator.maybeParse: parse successful" );
new Printer( ).print( val, System.err );
}
}
catch ( Exception e )
{
System.err.println(
"MaybeParseGenerator.maybeParse(): Could not parse '" +
unparsed + "'as XML" );
e.printStackTrace( System.err );
}
}
}

return val;
}
}

/* [end of file] */
What I'm getting in the error stream on the development server is (with
parser unconfigured, i.e. using Tomcat's default, which is Xerces; see
below for Crimson):

ElementGenerator.generate: attempting to parse <div class="Intro">
Here be dragons!
</div>
MaybeParseGenerator.maybeParse: parsing [<div class="Intro">
Here be dragons!
</div>]
Parsed document: [#document: null]
Root node: (org.apache.xerces.dom.DeferredElementImpl): [div: null]
MaybeParseGenerator.maybeParse(): Could not parse '<div class="Intro">
Here be dragons!
</div>'as XML
org.w3c.dom.DOMException: NOT_SUPPORTED_ERR: The implementation does not
support the requested type of object or operation.
at org.apache.xerces.dom.CoreDocumentImpl.importNode( Unknown Source)
at org.apache.xerces.dom.CoreDocumentImpl.importNode( Unknown Source)
at uk.co.weft.domutil.MaybeParseGenerator.maybeParse( MaybeParseGenerator.java:183)
(with parser configured as org.apache.crimson.tree.DOMImplementationImpl):

ElementGenerator.generate: attempting to parse <div class="Intro">
Here be dragons!
</div>
MaybeParseGenerator.maybeParse: parsing [<div class="Intro">
Here be dragons!
</div>]
Parsed document: org.apache.crimson.tree.XmlDocument@e9a0e9a
Root node: <div class="Intro">
Here be dragons!
</div>
MaybeParseGenerator.maybeParse(): Could not parse '<div class="Intro">
Here be dragons!
</div>'as XML
org.w3c.dom.DOMException: NOT_SUPPORTED_ERR: The implementation does not
support the requested type of object or operation.
at org.apache.xerces.dom.CoreDocumentImpl.importNode( Unknown Source)
at org.apache.xerces.dom.CoreDocumentImpl.importNode( Unknown Source)
at uk.co.weft.domutil.MaybeParseGenerator.maybeParse( MaybeParseGenerator.java:173)
What's showing up in the production server logs is:
(Firstly, evidence that it sometimes does work):
ElementGenerator.generate: attempting to parse <div
class="Introduction"><p>Copies of documentation issued to licensees is
available in this section.</p></div>
ElementGenerator.generate: attempting to parse Cockle Bags - further
information
(Secondly, evidence that it sometimes doesn't):
ElementGenerator.generate: attempting to parse <div class="Introduction">
Ayrshire and Dumfrieshire Cyclists Association is a regional
association
of cycling clubs within the structure of Scottish Cycling.
</div>
MayberParseGenerator.maybeParse(): Could not parse '<div
class="Introduction">
Ayrshire and Dumfrieshire Cyclists Association is a regional
association
of cycling clubs within the structure of Scottish Cycling.
</div>'as XML
java.lang.NullPointerException
at org.apache.xerces.dom.CoreDocumentImpl.importNode( Unknown Source)
at org.apache.xerces.dom.CoreDocumentImpl.importNode( Unknown Source)
at org.apache.xerces.dom.CoreDocumentImpl.importNode( Unknown Source)
at uk.co.weft.domutil.MaybeParseGenerator.maybeParse( MaybeParseGenerator.java:163)

I've checked the libraries and the two instances above use the same
versions of the same libraries with the same configuration, so why

<div class="Introduction"><p>Copies of documentation issued to licensees is
available in this section.</p></div>

parses successfully and

<div class="Introduction">
Ayrshire and Dumfrieshire Cyclists Association is a regional
association
of cycling clubs within the structure of Scottish Cycling.
</div>

fails to parse is frankly baffling me.

--
si***@jasmine.org.uk (Simon Brooke) http://www.jasmine.org.uk/~simon/
;; Let's have a moment of silence for all those Americans who are stuck
;; in traffic on their way to the gym to ride the stationary bicycle.
;; Rep. Earl Blumenauer (Dem, OR)

Mar 16 '07 #3
Just a quick observation: Your "sometimes works" and "sometimes doesn't"
are significantly different:

> (Firstly, evidence that it sometimes does work):
> ElementGenerator.generate: attempting to parse <div
> class="Introduction"><p>Copies of documentation issued to licensees is
> available in this section.</p></div>

<div> has a <p> child.

> (Secondly, evidence that it sometimes doesn't):
> ElementGenerator.generate: attempting to parse <div class="Introduction">
> Ayrshire and Dumfrieshire Cyclists Association is a regional
> association
> of cycling clubs within the structure of Scottish Cycling.
> </div>

<div> contains only text. Haven't looked at the code yet, but are you
sure you aren't doing something simple like trying to import the string
value rather than a TextNode object?

--
() ASCII Ribbon Campaign | Joe Kesselman
/\ Stamp out HTML e-mail! | System architexture and kinetic poetry
Mar 16 '07 #4
Also: You didn't show us the implementation of DOMStub... but with that
name, I wouldn't be at all surprised if you've got a subset
implementation there.

--
() ASCII Ribbon Campaign | Joe Kesselman
/\ Stamp out HTML e-mail! | System architexture and kinetic poetry
Mar 17 '07 #5
Well, I've reproduced the error message under Eclipse. Lemme see if I
can reproduce it with a current version of Xerces...

--
() ASCII Ribbon Campaign | Joe Kesselman
/\ Stamp out HTML e-mail! | System architexture and kinetic poetry
Mar 17 '07 #6
Oh. That's stupid; I should have remembered this:

http://www.w3.org/TR/2000/REC-DOM-Le...ent-importNode

You're attempting to import a Document node. That's forbidden. Import
its root element instead.

Yes, the error message could have been more helpful. I'd suggest posting
that as a suggestion on the Xerces users mailing list, since I'm not
sure any of the current Xerces maintainers are reading this list.
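
A minimal sketch of that fix, assuming the default JAXP parser: parse the string, then pass the parsed document's root element (not the Document itself) to importNode. The class ImportRootDemo and its helper names are hypothetical and are not taken from the posted MaybeParseGenerator code.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical illustration of importing the root element rather than the Document.
class ImportRootDemo {

    // Parses a string into a standalone document; checked exceptions are wrapped for brevity.
    static Document parse(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            return builder.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse '" + xml + "' as XML", e);
        }
    }

    // Returns a deep copy of the parsed root element, owned by 'doc' but not yet attached.
    static Node importRoot(Document doc, String xml) {
        Document parsed = parse(xml);
        // The key point: hand importNode the document element, never the Document node.
        return doc.importNode(parsed.getDocumentElement(), true);
    }

    // Imports a fragment into a host document, attaches it, and summarises the result.
    static String demo() {
        Document doc = parse("<page/>");
        Node val = importRoot(doc, "<div class=\"Intro\">Here be dragons!</div>");
        doc.getDocumentElement().appendChild(val);
        return val.getNodeName() + ":" + val.getTextContent() + ":" + (val.getOwnerDocument() == doc);
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The imported node still has to be appended somewhere in the target document; importNode only creates the copy.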
--
() ASCII Ribbon Campaign | Joe Kesselman
/\ Stamp out HTML e-mail! | System architexture and kinetic poetry
Mar 17 '07 #7
* Joe Kesselman wrote in comp.text.xml:
> You're attempting to import a Document node. That's forbidden. Import
> its root element instead.

Heh, I actually had a quick look into the Xerces source code when I
looked at the question, but that case was the only one where the specific
claimed exception would be raised, and Simon said he tried to import
element nodes, so I concluded the issue is too weird to investigate
further...
--
Björn Höhrmann · mailto:bj****@hoehrmann.de · http://bjoern.hoehrmann.de
Weinh. Str. 22 · Telefon: +49(0)621/4309674 · http://www.bjoernsworld.de
68309 Mannheim · PGP Pub. KeyID: 0xA4357E78 · http://www.websitedev.de/
Mar 17 '07 #8
in message <96******************************@comcast.com>, Joe Kesselman
('k*************@comcast.net') wrote:
> Oh. That's stupid; I should have remembered this:
>
> http://www.w3.org/TR/2000/REC-DOM-Le...ent-importNode
>
> You're attempting to import a Document node. That's forbidden. Import
> its root element instead.
>
> Yes, the error message could have been more helpful. I'd suggest posting
> that as a suggestion on the Xerces users mailing list, since I'm not
> sure any of the current Xerces maintainers are reading this list.

Thank you. I was going to say indignantly 'oh no I don't', but on reading
through my code I see I get the root node of the document... and then
don't use it. Having fixed that, /this/ problem is solved, and I can now
replace vintage Crimson with current Xerces and my code still works.

Still can't get it to work with current Xalan, but that's another set of
problems...

--
si***@jasmine.org.uk (Simon Brooke) http://www.jasmine.org.uk/~simon/

;; Good grief, I can remember when England won the Ashes.
Mar 17 '07 #9
in message <H7******************************@comcast.com>, Joe Kesselman
('k*************@comcast.net') wrote:
> Also: You didn't show us the implementation of DOMStub... but with that
> name, I wouldn't be at all surprised if you've got a subset
> implementation there.

No, it just allows me to select and configure the DOMImplementation I use:

/**
* Should be called before DOMStub is used, but perfectly safe to call
* more than once. If I've already been initialised, don't initialise me
* again.
*
* @param config my configuration
*
* @exception InitialisationException if requested DOM implementation
* can't be found
*/
public static void init( Context config ) throws InitialisationException
{
String s = config.getValueAsString( "dom_implementation_class" );

if ( domImp == null )
{
/* i.e., I have not already been initialised */
try
{
if ( s != null )
{
domImpName = s;
}

domImp =
(DOMImplementation) Class.forName( domImpName )
.newInstance( );
}
catch ( Exception any )
{
throw new InitialisationException( "Could not find DOM " +
"implementation " + domImpName );
}
}

Boolean b = config.getValueAsBoolean( "dom_coalescing" );

if ( b != null )
{
dbf.setCoalescing( b.booleanValue( ) );
}

b = config.getValueAsBoolean( "dom_expand_entity_references" );

if ( b != null )
{
dbf.setExpandEntityReferences( b.booleanValue( ) );
}

b = config.getValueAsBoolean( "dom_ignore_comments" );

if ( b != null )
{
dbf.setIgnoringComments( b.booleanValue( ) );
}

b = config.getValueAsBoolean( "dom_ignore_whitespace" );

if ( b != null )
{
dbf.setIgnoringElementContentWhitespace( b.booleanValue( ) );
}

b = config.getValueAsBoolean( "dom_namespace_aware" );

if ( b != null )
{
dbf.setNamespaceAware( b.booleanValue( ) );
}

b = config.getValueAsBoolean( "dom_validating" );

if ( b != null )
{
dbf.setValidating( b.booleanValue( ) );
}
}
}
--
si***@jasmine.org.uk (Simon Brooke) http://www.jasmine.org.uk/~simon/

X-no-archive: No, I'm not *that* naive.

Mar 17 '07 #10
Bjoern Hoehrmann wrote:
> Simon said he tried to import element nodes, so I concluded the issue is
> too weird to investigate further...

This is why it's often helpful to post a minimal example that provokes
the problem. In fact, the process of extracting code and writing that
reduced example is often enough to expose the problem.

I must admit I cheated -- I tossed the code into a debugger, did some
cleanup so it could actually be run, added the Xerces source (so I could
see what was happening inside that), set the classpaths to use this copy
of Xerces rather than the one in the Java libraries, set it to stop when
a DOMException was about to be thrown, and just pushed the button.
Bingo; there we are at the error, and the object in question is indeed a
Document.
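
For comparison, a reduced example of the kind described above might look like the sketch below. It assumes a DOM implementation that follows the DOM Level 2 rule that Document nodes cannot be imported (the Xerces-derived parser bundled with the JDK does); the class and method names are hypothetical.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.DOMException;
import org.w3c.dom.Document;

// Hypothetical minimal reproduction: importing a Document node is rejected.
class ImportDocumentRepro {

    // Creates an empty document with the default JAXP implementation.
    static Document newDocument() {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Returns true if importing a whole Document fails with NOT_SUPPORTED_ERR.
    static boolean importingDocumentIsRejected() {
        Document target = newDocument();
        Document parsed = newDocument();
        parsed.appendChild(parsed.createElement("div"));
        try {
            target.importNode(parsed, true); // the mistake: passing the Document itself
            return false;
        } catch (DOMException e) {
            return e.code == DOMException.NOT_SUPPORTED_ERR;
        }
    }

    public static void main(String[] args) {
        System.out.println("Document import rejected: " + importingDocumentIsRejected());
    }
}
```

Checking the exception's code field rather than its message keeps the example independent of any one implementation's error text.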
--
() ASCII Ribbon Campaign | Joe Kesselman
/\ Stamp out HTML e-mail! | System architexture and kinetic poetry
Mar 17 '07 #11
