parsing XML files with SAX

P: n/a
hi,

I've been looking through the various XML parser APIs available and I have
decided to use the SAX parser. Probably not the best of choices, but I think
it can do the job. What is the best way to parse an XML file using the SAX
parser? I have seen examples where they store each element's content in
JavaBean classes. I am not sure this is a good approach for my XML file,
which looks like this:

<parent>
  <node1>
    <child1>AAA</child1>
    <grandchild1>BBB</grandchild1>
    <grandchild2>
      <anything>CCC</anything>
    </grandchild2>
    <child2>DDD</child2>
    <child3>DDD</child3>
  </node1>
  <node2>
    <child1>AAA</child1>
    <grandchild1>BBB</grandchild1>
    <grandchild2>
      <anything>CCC</anything>
    </grandchild2>
    <child2>DDD</child2>
    <child3>DDD</child3>
  </node2>
</parent>

I have to get the value of the "anything" element in node1, node2, etc.,
store the value of child3 in a database, and so on.

Does anyone have any experience or advice regarding the fastest way to do
that using SAX (or any other parser)?

Thanks !
Jul 23 '05 #1
7 Replies


P: n/a
mike henkins <dd@nospam.com> wrote:
Does anyone have any experience or advice regarding the fastest way to do
that using SAX (or any other parser)?


Try
http://home.eol.ca/~parkw/index.html#expat
which is a shell interface to the Expat XML parser.

--
William Park <op**********@yahoo.ca>, Toronto, Canada
ThinFlash: Linux thin-client on USB key (flash) drive
http://home.eol.ca/~parkw/thinflash.html
BashDiff: Super Bash shell
http://freshmeat.net/projects/bashdiff/
Jul 23 '05 #2

P: n/a
I would personally prefer DOM parsing in this case. DOM gives us a neat
object-oriented way to read elements and attributes, as well as to
modify the document tree.

With the SAX approach, I'd have to set up the whole parser call back
infrastructure in my application just to read a single element node,
which somehow doesn't appeal to me!

Another advantage of DOM is that I can easily store element and
attribute properties in Java beans or other kinds of container
objects. With SAX I'd find that more difficult to do.

I'd prefer SAX if I had to read the whole (or nearly the whole)
document serially in one pass.
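
A minimal sketch of the DOM approach for the XML in the question, assuming the standard JAXP API (javax.xml.parsers) that ships with Java. The class name DomSketch and the helper textOf are hypothetical names chosen for illustration, and the database step is left out:

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class DomSketch {

    /** Returns the trimmed text of every element named 'tag', in document order. */
    public static List<String> textOf(String xml, String tag) throws Exception {
        // Build the whole document tree in memory first.
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

        // Look the elements up by name anywhere in the tree.
        NodeList nodes = doc.getElementsByTagName(tag);
        List<String> out = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            out.add(nodes.item(i).getTextContent().trim());
        }
        return out;
    }

    public static void main(String[] args) throws Exception {
        // Shortened version of the sample file from the question.
        String xml =
            "<parent>"
          + "<node1><grandchild2><anything>CCC</anything></grandchild2>"
          + "<child3>DDD</child3></node1>"
          + "<node2><grandchild2><anything>CCC</anything></grandchild2>"
          + "<child3>DDD</child3></node2>"
          + "</parent>";

        System.out.println(textOf(xml, "anything")); // prints [CCC, CCC]
        System.out.println(textOf(xml, "child3"));   // prints [DDD, DDD]
    }
}
```

The trade-off is that the whole tree is held in memory before anything is read, which is convenient for small files but costs more as the file grows.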

Regards,
Mukul

Jul 24 '05 #3

P: n/a
Mukul Gandhi wrote:
With the SAX approach, I'd have to set up the whole parser call back
infrastructure in my application just to read a single element node,
which somehow doesn't appeal to me!


Is it really so complicated to set up "the whole parser call back
infrastructure" ? Even in Java this should not be much more text
than a comparable DOM solution.

Besides Java, there are scripting languages based upon
the SAX approach. In these languages, reading a single element node
can be done with a one-line script. The larger your file is,
the greater the speed advantage of a SAX-based script.
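
To give an idea of the amount of text involved in Java, here is a minimal sketch of a SAX handler for the XML in the question, assuming the standard JAXP API (javax.xml.parsers and org.xml.sax). The names SaxSketch, Collector and parse are hypothetical and only serve the illustration; storing the values in a database is left out:

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class SaxSketch {

    /** Collects the text of every <anything> and <child3> element in one pass. */
    public static class Collector extends DefaultHandler {
        public final List<String> anything = new ArrayList<>();
        public final List<String> child3 = new ArrayList<>();
        private final StringBuilder text = new StringBuilder();

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            text.setLength(0); // start buffering text for the element just opened
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            // The parser may deliver one text node in several chunks, so append.
            text.append(ch, start, length);
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            if (qName.equals("anything")) {
                anything.add(text.toString().trim());
            } else if (qName.equals("child3")) {
                child3.add(text.toString().trim());
            }
            text.setLength(0);
        }
    }

    /** Parses the given XML text and returns the filled collector. */
    public static Collector parse(String xml) throws Exception {
        Collector collector = new Collector();
        SAXParserFactory.newInstance()
                .newSAXParser()
                .parse(new InputSource(new StringReader(xml)), collector);
        return collector;
    }

    public static void main(String[] args) throws Exception {
        // Shortened version of the sample file from the question.
        String xml =
            "<parent>"
          + "<node1><grandchild2><anything>CCC</anything></grandchild2>"
          + "<child3>DDD</child3></node1>"
          + "<node2><grandchild2><anything>CCC</anything></grandchild2>"
          + "<child3>DDD</child3></node2>"
          + "</parent>";

        Collector c = parse(xml);
        System.out.println(c.anything); // prints [CCC, CCC]
        System.out.println(c.child3);   // prints [DDD, DDD]
    }
}
```

The handler keeps only the text of the element currently being read, so memory use stays flat no matter how large the file is.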
Jul 24 '05 #4

P: n/a
Thanks for telling more about SAX. Which scripting languages have SAX
bindings? Can you please provide some references?

Regards,
Mukul

Jul 24 '05 #5

P: n/a
Mukul Gandhi wrote:
Thanks for telling more about SAX. Which scripting languages have SAX
bindings? Can you please provide some references?


GNU Awk and bash have XML extensions which are not
yet merged into the official source code:

http://home.vrweb.de/~juergen.kahrs/gawk/XML/
http://home.eol.ca/~parkw/index.html#expat

Perl is probably the scripting language that has
the longest tradition of supporting XML files.
Python, Ruby etc. also have some kind of XML
support. Recently, there has been an ECMA proposal
for extending JavaScript with functions for processing
XML data. Use Google to find out more.
Jul 24 '05 #6

P: n/a
Seems you have done good work with gawk XML. Very nice.

Regards,
Mukul

Jürgen Kahrs wrote:
GNU Awk and bash have XML extensions which are not
yet merged into the official source code:

http://home.vrweb.de/~juergen.kahrs/gawk/XML/
http://home.eol.ca/~parkw/index.html#expat

Perl is probably the scripting language that has
the longest tradition of supporting XML files.
Python, Ruby etc. also have some kind of XML
support. Recently, there has been an ECMA proposal
for extending JavaScript with functions for processing
XML data. Use Google to find out more.


Jul 25 '05 #7

P: n/a
http://vtd-xml.sf.net
"mike henkins" <dd@nospam.com> wrote in message
news:42*********************@news.wanadoo.fr...
Does anyone have any experience or advice regarding the fastest way to do
that using SAX (or any other parser)?

Thanks !

Jul 27 '05 #8
