
Tree splitting/merging

I'm looking for resources on splitting and merging XML trees. Specifically,
on methods to pare large XML documents into smaller documents which can be
merged later.

Off the top of my head, I can envision unions of node sets, and unions of
node text. But I know there is much more to the subject than that, probably
more in the way of alternative approaches than of greater technical detail.

TIA,

Bill
Jul 20 '05 #1


> I'm looking for resources on splitting and merging XML trees. Specifically,
> on methods to pare large XML documents into smaller documents which can be
> merged later.

I have something for a problem (perhaps) close to yours: I need to perform
an XSLT transformation on a very large document that doesn't fit in memory.
I use a SAX parser with three XMLFilters (concretely, subclasses of
org.xml.sax.helpers.XMLFilterImpl). The first class "splits" the stream
(i.e. it fires "start document" and "end document" events) when it
encounters a specific startElement and endElement. So the next filter
receives several (smaller) documents, one at a time. This second filter is
a TransformerHandler, which performs the transformation. It then passes the
events to a last filter, a "merger", which discards the startDocument and
endDocument events except for the very first and the very last ones.
I was inspired by a Perl module by Barrie Slaymaker.
(Incidentally, I noticed that there is nothing for Java as convenient as
the XML::SAX::Pipeline Perl module.)
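
A minimal sketch of the splitter and merger described above might look like
the following. It is not the code from this post: the class names
SplitFilter, MergeFilter and SplitMergeDemo, the element name "record", and
the document-counting handler are all hypothetical. The TransformerHandler
stage in the middle is left out, and the split element is assumed not to
nest.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

// Hypothetical splitter: brackets each split element with its own
// startDocument/endDocument pair. Assumes split elements are not nested.
class SplitFilter extends XMLFilterImpl {
    private final String splitName;

    SplitFilter(XMLReader parent, String splitName) {
        super(parent);
        this.splitName = splitName;
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts)
            throws SAXException {
        if (qName.equals(splitName)) {
            super.startDocument();
        }
        super.startElement(uri, localName, qName, atts);
    }

    @Override
    public void endElement(String uri, String localName, String qName) throws SAXException {
        super.endElement(uri, localName, qName);
        if (qName.equals(splitName)) {
            super.endDocument();
        }
    }
}

// Hypothetical merger: forwards only the outermost startDocument and
// endDocument, so downstream handlers see a single document again.
class MergeFilter extends XMLFilterImpl {
    private int openDocuments = 0;

    MergeFilter(XMLReader parent) {
        super(parent);
    }

    @Override
    public void startDocument() throws SAXException {
        if (openDocuments++ == 0) {
            super.startDocument();
        }
    }

    @Override
    public void endDocument() throws SAXException {
        if (--openDocuments == 0) {
            super.endDocument();
        }
    }
}

class SplitMergeDemo {
    // Counts the startDocument events that reach the end of the chain.
    static int countDocuments(String xml, boolean merge) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            XMLReader chain = new SplitFilter(factory.newSAXParser().getXMLReader(), "record");
            if (merge) {
                chain = new MergeFilter(chain);
            }
            final int[] docs = {0};
            chain.setContentHandler(new DefaultHandler() {
                @Override
                public void startDocument() {
                    docs[0]++;
                }
            });
            chain.parse(new InputSource(new StringReader(xml)));
            return docs[0];
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

For an input such as <root><record>a</record><record>b</record></root>,
the chain without the merger sees three startDocument events (the outer one
plus one per record), and the chain with the merger sees one. In a real
pipeline the per-document processing would sit between the two filters.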

In fact, I came to this list with a question close to this one; it's in
a new thread...

> Off the top of my head, I can envision unions of node sets, and unions of
> node text. But I know there is much more to the subject than that, probably
> more in the way of alternative approaches than of greater technical detail.

What level of well-formedness does your merging problem involve? That is, do
you only want to add nodes to existing nodes in DOM fashion (for which you
just need the standard methods of the Node interface), or do you want to
insert mixed content while checking for well-formedness, tag nesting, etc.?
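
As a rough sketch of the DOM case, assuming both documents fit in memory
and share a compatible root element, the merge can be done with
Document.importNode and Node.appendChild. The class DomMerge and its
methods are hypothetical names used only for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class DomMerge {
    // Parses an XML string into a DOM document (helper for the example).
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Appends a deep copy of every child of source's root element to
    // target's root element. The source document is left unchanged.
    static void mergeInto(Document target, Document source) {
        Node targetRoot = target.getDocumentElement();
        for (Node child = source.getDocumentElement().getFirstChild();
                child != null;
                child = child.getNextSibling()) {
            // Nodes belong to one document; importNode creates a copy
            // owned by the target before it is attached.
            targetRoot.appendChild(target.importNode(child, true));
        }
    }
}
```

Merging <root><b/><c/></root> into <root><a/></root> this way leaves the
target root with the children a, b, c. Nothing here checks for duplicate
content or schema validity; that would be extra logic on top.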


Jul 20 '05 #2

sylvain.loiseau <sy*************@wanadoo.fr> wrote:
>> I'm looking for resources on splitting and merging XML trees. Specifically,
>> on methods to pare large XML documents into smaller documents which can be
>> merged later.
>
> I have something for a problem (perhaps) close to yours: I need to perform
> an XSLT transformation on a very large document that doesn't fit in memory.
> I use a SAX parser with three XMLFilters (concretely, subclasses of
> org.xml.sax.helpers.XMLFilterImpl). The first class "splits" the stream
> (i.e. it fires "start document" and "end document" events) when it
> encounters a specific startElement and endElement. So the next filter
> receives several (smaller) documents, one at a time. This second filter is
> a TransformerHandler, which performs the transformation. It then passes the
> events to a last filter, a "merger", which discards the startDocument and
> endDocument events except for the very first and the very last ones.
> I was inspired by a Perl module by Barrie Slaymaker.
> (Incidentally, I noticed that there is nothing for Java as convenient as
> the XML::SAX::Pipeline Perl module.)


Right after posting I tripped over the XPipe project (http://xpipe.sf.net/).
XPipe associates this with the scatter/gather pattern, and they seem to have
put a lot of thought into the issues. Specifically, they elaborate on the
notion of "fulcra", or node depths I suppose you could call them, at which a
document can be split. You've probably already thought this through, but
maybe you can find more info on that site. They have code and list
discussions you can wade through.
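
To make the idea of a split depth concrete, here is a rough DOM-based
sketch. It is not taken from XPipe and does not use its API; the class
DepthSplit and the method splitAtDepth are hypothetical, and the whole
document is assumed to fit in memory. Every element found at the chosen
depth (the root being depth 0) becomes the root of its own small document.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class DepthSplit {
    private static DocumentBuilder builder() {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Parses an XML string into a DOM document (helper for the example).
    static Document parse(String xml) {
        try {
            return builder().parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Returns one new document per element located at the given depth.
    static List<Document> splitAtDepth(Document source, int depth) {
        List<Document> parts = new ArrayList<>();
        collect(source.getDocumentElement(), 0, depth, parts);
        return parts;
    }

    private static void collect(Node node, int current, int target, List<Document> parts) {
        if (node.getNodeType() != Node.ELEMENT_NODE) {
            return; // text, comments, etc. above the split depth are skipped
        }
        if (current == target) {
            Document part = builder().newDocument();
            part.appendChild(part.importNode(node, true)); // deep copy of the subtree
            parts.add(part);
            return;
        }
        for (Node c = node.getFirstChild(); c != null; c = c.getNextSibling()) {
            collect(c, current + 1, target, parts);
        }
    }
}
```

Note that this sketch drops the ancestors above the split depth, so a later
merge would need to record them (or re-create a wrapper element) to rebuild
the original tree.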

- Bill
Jul 20 '05 #3

Thanks, it looks very interesting.

Sylvain

"William Ahern" <wi*****@wilbur.25thandClement.com> wrote in message
news: g4************@wilbur.25thandClement.com...
> Right after posting I tripped over the XPipe project (http://xpipe.sf.net/).
> XPipe associates this with the scatter/gather pattern, and they seem to have
> put a lot of thought into the issues. Specifically, they elaborate on the
> notion of "fulcra", or node depths I suppose you could call them, at which a
> document can be split. You've probably already thought this through, but
> maybe you can find more info on that site. They have code and list
> discussions you can wade through.
>
> - Bill

Jul 20 '05 #4
