Martin Honnen wrote:
> Mozilla's XML parser is known to create a single text node for each
> chunk of 4096 characters
If so, they're doing some slightly sloppy buffer management.
SAX is allowed to divide text up wherever's convenient for the parser's
input buffers, since SAX was theoretically intended to be a thin layer
between the parser and application. (It should be even thinner, but it's
a bit late to argue about that now.)
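
For what it's worth, here is a minimal sketch of how SAX code usually copes with that: buffer whatever characters() hands over and only treat it as one block at the next element boundary. It uses Python's standard xml.sax module; the names TextCollector and collect_text_blocks are made up for illustration. It also assumes only element boundaries end a text block, so comments and processing instructions are ignored.

```python
import xml.sax


class TextCollector(xml.sax.ContentHandler):
    """Joins the pieces passed to characters() into one string per text block."""

    def __init__(self):
        super().__init__()
        self._pieces = []  # chunks received since the last element boundary
        self.blocks = []   # one entry per complete text block

    def characters(self, content):
        # The parser may call this several times for one run of text.
        self._pieces.append(content)

    def _flush(self):
        # Close off the current run, if any, as a single block.
        if self._pieces:
            self.blocks.append("".join(self._pieces))
            self._pieces = []

    def startElement(self, name, attrs):
        self._flush()

    def endElement(self, name):
        self._flush()


def collect_text_blocks(xml_bytes):
    """Hypothetical helper: parse xml_bytes and return its text blocks."""
    handler = TextCollector()
    xml.sax.parseString(xml_bytes, handler)
    return handler.blocks
```

The application sees whole blocks no matter where the parser chose to cut the input.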
But DOM Level 1 Core's description of Text nodes, reiterated in Level 2
and Level 3, says "When a document is first made available via the DOM,
there is only one Text node for each block of text." So it's surprising,
disappointing, and annoying that Mozilla isn't honoring that expectation.
Ignoring this requirement may be giving them a bit of a performance
boost. And well-written DOM code should be able to deal with it,
especially since parsers which retain CDATA-section boundaries will have
intermixed Text and CDATASection nodes which can cause similar hassles.
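
As a rough illustration of "dealing with it", the sketch below treats any run of adjacent Text and CDATASection children as one logical block rather than assuming one node per block. It uses Python's xml.dom.minidom purely as a stand-in DOM (not Mozilla's), and text_runs is a hypothetical helper name. Calling normalize() on the parent would merge adjacent Text nodes, but it leaves CDATASection nodes alone, which is why the helper joins both kinds itself.

```python
from xml.dom import Node
from xml.dom.minidom import parseString

# Node types that carry character data and may sit side by side.
TEXT_TYPES = (Node.TEXT_NODE, Node.CDATA_SECTION_NODE)


def text_runs(element):
    """Return one string per run of adjacent Text/CDATASection children."""
    runs, current = [], []
    for child in element.childNodes:
        if child.nodeType in TEXT_TYPES:
            current.append(child.data)
        elif current:
            # A non-text child ends the current run.
            runs.append("".join(current))
            current = []
    if current:
        runs.append("".join(current))
    return runs


# Simulate a parser that fragmented a text block into two Text nodes.
doc = parseString("<a>hello world<b/>tail</a>")
root = doc.documentElement
root.firstChild.splitText(5)
runs = text_runs(root)
```

Code written this way gives the same answer whether the text arrived as one node, as 4096-character chunks, or interleaved with CDATA sections.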
So if they wanted to offer this as an _optional_ mode, I wouldn't
complain... But if they aren't defaulting to delivering the document in
single-node-per-text-block form, they really aren't fully conforming to
the spec.
--
() ASCII Ribbon Campaign | Joe Kesselman
/\ Stamp out HTML e-mail! | System architexture and kinetic poetry