Tree splitting/merging - .NET Framework

William Ahern

I'm looking for resources on splitting and merging XML trees, specifically
on methods to pare large XML documents into smaller documents which can be
merged later.

Off the top of my head, I can envision unions of node sets and unions of
node text. But I know there's much more to the subject than that: if not
more alternatives, then at least greater technical detail.

TIA,

Bill

Jul 20 '05 #1


sylvain.loiseau

> I'm looking for resources on splitting and merging XML trees. Specifically,
> on methods to pare large XML documents into smaller documents which can be
> merged later.

I have something for a problem that is (perhaps) close to yours: I need to perform
an XSLT transformation on a very large document which doesn't fit in memory. I
use a SAX parser with three XMLFilters (concretely, subclasses of
org.xml.sax.helpers.XMLFilterImpl). The first filter "splits" the stream (i.e.
it emits a "start document" and an "end document" event) when it encounters a
specific startElement and endElement. So the next filter receives several (smaller)
documents, one at a time. This second filter is a TransformerHandler, which
performs the transformation. It then passes the events to a last filter, a
"merger", which discards the startDocument and endDocument events except the very
first and the very last ones.
I was inspired by a Perl module by Barrie Slaymaker.
(Incidentally, I noticed that there is nothing for Java as convenient as
the XML::SAX::Pipeline Perl module.)
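
To make the splitter/merger idea concrete, here is a minimal sketch of how such a pipeline could look. It is not my actual code: the class names (SplitMergeSketch, Splitter, Merger, Recorder) and the run() helper are made up for illustration, the TransformerHandler stage in the middle is left out, and the Recorder only writes the events it receives into a string so that the effect is visible. It assumes the default, non-namespace-aware parser, so elements are matched on their qName.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

public class SplitMergeSketch {

    /** Brackets every element named chunkName with its own startDocument/endDocument. */
    static class Splitter extends XMLFilterImpl {
        private final String chunkName;
        private int chunkDepth = 0; // > 0 while we are inside a chunk element

        Splitter(XMLReader parent, String chunkName) {
            super(parent);
            this.chunkName = chunkName;
        }

        @Override
        public void startElement(String uri, String local, String qName, Attributes atts)
                throws SAXException {
            if (chunkDepth > 0) {
                chunkDepth++;               // nested element inside the current chunk
            } else if (qName.equals(chunkName)) {
                chunkDepth = 1;
                super.startDocument();      // open a small "document" for this chunk
            }
            super.startElement(uri, local, qName, atts);
        }

        @Override
        public void endElement(String uri, String local, String qName) throws SAXException {
            super.endElement(uri, local, qName);
            if (chunkDepth > 0 && --chunkDepth == 0) {
                super.endDocument();        // close the chunk's "document"
            }
        }
    }

    /** Forwards only the outermost startDocument/endDocument pair. */
    static class Merger extends XMLFilterImpl {
        private int docDepth = 0;

        Merger(XMLReader parent) {
            super(parent);
        }

        @Override
        public void startDocument() throws SAXException {
            if (docDepth++ == 0) super.startDocument();
        }

        @Override
        public void endDocument() throws SAXException {
            if (--docDepth == 0) super.endDocument();
        }
    }

    /** Writes the events it receives into a string, for demonstration only. */
    static class Recorder extends DefaultHandler {
        final StringBuilder out = new StringBuilder();
        @Override public void startDocument() { out.append("[doc]"); }
        @Override public void endDocument() { out.append("[/doc]"); }
        @Override public void startElement(String u, String l, String q, Attributes a) {
            out.append('<').append(q).append('>');
        }
        @Override public void endElement(String u, String l, String q) {
            out.append("</").append(q).append('>');
        }
        @Override public void characters(char[] ch, int start, int len) {
            out.append(ch, start, len);
        }
    }

    /** Runs the pipeline over xml; with merge == false the Merger is left out. */
    public static String run(String xml, String chunkName, boolean merge) throws Exception {
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        XMLReader pipeline = new Splitter(reader, chunkName);
        if (merge) pipeline = new Merger(pipeline);
        Recorder recorder = new Recorder();
        pipeline.setContentHandler(recorder);
        pipeline.parse(new InputSource(new StringReader(xml)));
        return recorder.out.toString();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<items><item>a</item><item>b</item></items>";
        System.out.println(run(xml, "item", false));
        System.out.println(run(xml, "item", true));
    }
}
```

Without the Merger, the recorder sees one extra [doc]...[/doc] pair around each item element; with it, only the outermost pair survives. In a real pipeline the per-chunk processing would sit between the two filters and would have to deal with the events that fall outside any chunk.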

In fact I was coming on this list for a question close to this one: it's in
a new thread...

> Off of the top of my head, I can envision unions of node sets, and unions of
> node text. But I know there's much more to the subject than that, if not
> more alternatives than greater technical detail.

What level of well-formedness does your merging problem involve? That is, do you
only want to add nodes to existing nodes in DOM fashion (in which case you just need
the standard methods of the Node interface), or do you want to insert mixed content
while checking for well-formedness, tag nesting, etc.?
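
For the simple DOM case, a rough sketch of what I mean could look like the following. The class and method names (DomMergeSketch, mergeInto, mergedChildNames) are invented for the example, and it assumes both documents are small enough to be held in memory and that the children of one root can simply be appended to the other. It relies only on Document.importNode and Node.appendChild.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class DomMergeSketch {

    static Document parse(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    /** Appends deep copies of the children of source's root to target's root. */
    public static void mergeInto(Document target, Document source) {
        Node targetRoot = target.getDocumentElement();
        Node child = source.getDocumentElement().getFirstChild();
        while (child != null) {
            // importNode copies the node into target's document; source stays untouched
            targetRoot.appendChild(target.importNode(child, true));
            child = child.getNextSibling();
        }
    }

    /** Merges b into a and returns the names of the root's children, comma-separated. */
    public static String mergedChildNames(String a, String b) throws Exception {
        Document target = parse(a);
        mergeInto(target, parse(b));
        StringBuilder names = new StringBuilder();
        for (Node n = target.getDocumentElement().getFirstChild(); n != null; n = n.getNextSibling()) {
            if (names.length() > 0) names.append(',');
            names.append(n.getNodeName());
        }
        return names.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(mergedChildNames("<r><x/></r>", "<r><y/><z/></r>"));
    }
}
```

This is a plain concatenation of children. It does no matching of equivalent nodes and no duplicate detection, which is where the harder merging questions begin.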

Jul 20 '05 #2

William Ahern

sylvain.loiseau <sy*************@wanadoo.fr> wrote:

>> I'm looking for resources on splitting and merging XML trees. Specifically,
>> on methods to pare large XML documents into smaller documents which can be
>> merged later.
>
> I have something for a problem that is (perhaps) close to yours: I need to perform
> an XSLT transformation on a very large document which doesn't fit in memory. I
> use a SAX parser with three XMLFilters (concretely, subclasses of
> org.xml.sax.helpers.XMLFilterImpl). The first filter "splits" the stream (i.e.
> it emits a "start document" and an "end document" event) when it encounters a
> specific startElement and endElement. So the next filter receives several (smaller)
> documents, one at a time. This second filter is a TransformerHandler, which
> performs the transformation. It then passes the events to a last filter, a
> "merger", which discards the startDocument and endDocument events except the very
> first and the very last ones.
> I was inspired by a Perl module by Barrie Slaymaker.
> (Incidentally, I noticed that there is nothing for Java as convenient as
> the XML::SAX::Pipeline Perl module.)

Right after posting I tripped over the XPipe project (http://xpipe.sf.net/).
XPipe associates this with the scatter/gather pattern, and they seem to have
put a lot of thought into the issues. Specifically, they elaborate on the
notion of "fulcra", or the node depth, I suppose you could call it, at which a
document can be split. You've probably already thought this through, but
maybe you can find more info on that site. They have code and list
discussions you can wade through.
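
To illustrate what I mean by splitting on a node depth, here is a small sketch. To be clear, this is not XPipe's code or API; the names (DepthSplitSketch, splitAtDepth, countParts) are mine, and it uses an in-memory DOM purely to keep the example short, so it would not help with documents that don't fit in memory. The root element is taken to be depth 0.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class DepthSplitSketch {

    /** Returns one new document per element found at the given depth (root = 0). */
    public static List<Document> splitAtDepth(Document doc, int depth) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        List<Document> parts = new ArrayList<>();
        collect(doc.getDocumentElement(), 0, depth, builder, parts);
        return parts;
    }

    private static void collect(Element element, int current, int depth,
                                DocumentBuilder builder, List<Document> parts) {
        if (current == depth) {
            // Copy the whole subtree into a document of its own
            Document part = builder.newDocument();
            part.appendChild(part.importNode(element, true));
            parts.add(part);
            return;
        }
        for (Node c = element.getFirstChild(); c != null; c = c.getNextSibling()) {
            if (c instanceof Element) {
                collect((Element) c, current + 1, depth, builder, parts);
            }
        }
    }

    /** Convenience helper: parses xml and reports how many parts a split yields. */
    public static int countParts(String xml, int depth) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));
        return splitAtDepth(doc, depth).size();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<a><b><c/><c/></b><b><c/></b></a>";
        System.out.println(countParts(xml, 1));
        System.out.println(countParts(xml, 2));
    }
}
```

Note that everything above the split depth (the ancestors and any text between the split elements) is dropped here, so a matching gather step would have to remember that skeleton separately.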

- Bill

Jul 20 '05 #3

sylvain.loiseau

Thanks, it looks very interesting.

Sylvain

Jul 20 '05 #4
