parsing XML files with SAX

Hi,

I've been looking through the various XML parser APIs available and have
decided to use SAX. It is probably not the best choice, but I think
it can do the job. What is the best way to parse an XML file using a SAX
parser? I have seen examples that store each element in Java bean
classes. I am not sure this is a good approach for my XML file, which looks
like this:

<parent>
  <node1>
    <child1>AAA</child1>
    <grandchild1>BBB</grandchild1>
    <grandchild2>
      <anything>CCC</anything>
    </grandchild2>
    <child2>DDD</child2>
    <child3>DDD</child3>
  </node1>
  <node2>
    <child1>AAA</child1>
    <grandchild1>BBB</grandchild1>
    <grandchild2>
      <anything>CCC</anything>
    </grandchild2>
    <child2>DDD</child2>
    <child3>DDD</child3>
  </node2>
</parent>

I have to get the value of the "anything" element in node1, node2, etc., store
the value of child3 in a database, and so on.

Does anyone have any experience or advice regarding the fastest way to do
that using SAX (or any other parser)?

Thanks!
Jul 23 '05 #1
mike henkins <dd@nospam.com> wrote:
Does anyone have any experience or advice regarding the fastest way to do
that using SAX (or any other parser)?


Try
http://home.eol.ca/~parkw/index.html#expat
which is a shell interface to the Expat XML parser.

--
William Park <op**********@yahoo.ca>, Toronto, Canada
ThinFlash: Linux thin-client on USB key (flash) drive
http://home.eol.ca/~parkw/thinflash.html
BashDiff: Super Bash shell
http://freshmeat.net/projects/bashdiff/
Jul 23 '05 #2
I would personally prefer DOM parsing in this case. DOM gives us a neat
object-oriented way to read elements and attributes, as well as to
modify the document tree.

With the SAX approach, I'd have to set up the whole parser callback
infrastructure in my application just to read a single element node,
which somehow doesn't appeal to me!

Another advantage of DOM is that I can easily store element and
attribute properties in Java beans or in other kinds of container
objects. With SAX I'd find that difficult to do.

I'd prefer SAX if I had to read the whole (or nearly the whole)
document serially in one pass.
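
A rough sketch of the DOM approach for the XML in the original post. It is written in Python (xml.dom.minidom from the standard library) rather than Java to keep it short; the Node class, the text_of helper, and the read_nodes function are illustrative names, not part of any library.

```python
from dataclasses import dataclass
from xml.dom import minidom


@dataclass
class Node:
    """Plain container object, playing the role of a Java bean."""
    name: str
    anything: str
    child3: str


def text_of(parent, tag):
    """Return the text of the first <tag> descendant of parent, or ''."""
    found = parent.getElementsByTagName(tag)
    if not found:
        return ""
    pieces = [n.data for n in found[0].childNodes if n.nodeType == n.TEXT_NODE]
    return "".join(pieces).strip()


def read_nodes(xml_text):
    """Build the whole tree in memory, then pick out the values we need."""
    doc = minidom.parseString(xml_text)
    root = doc.documentElement
    # Keep only element children of the root; skip whitespace text nodes.
    elements = [c for c in root.childNodes if c.nodeType == c.ELEMENT_NODE]
    return [
        Node(e.tagName, text_of(e, "anything"), text_of(e, "child3"))
        for e in elements
    ]
```

Calling read_nodes on the sample document should give one Node object per <node1>, <node2>, ... element. The whole document is held in memory, which is the trade-off against SAX.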

Regards,
Mukul

Jul 24 '05 #3
Mukul Gandhi wrote:
With the SAX approach, I'd have to set up the whole parser callback
infrastructure in my application just to read a single element node,
which somehow doesn't appeal to me!


Is it really so complicated to set up "the whole parser callback
infrastructure"? Even in Java this should not take much more code
than a comparable DOM solution.
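
To give an idea of the size of that callback setup, here is a minimal sketch for the XML in the original post. It uses Python's standard xml.sax module, which follows the same startElement / characters / endElement callback model as the Java SAX API. The NodeHandler class, the load_rows and store functions, the sqlite3 database, and the "nodes" table are all assumptions made for illustration.

```python
import sqlite3
import xml.sax

SAMPLE = """<parent>
  <node1>
    <child1>AAA</child1>
    <grandchild2><anything>CCC</anything></grandchild2>
    <child3>DDD</child3>
  </node1>
  <node2>
    <child1>AAA</child1>
    <grandchild2><anything>CCC</anything></grandchild2>
    <child3>DDD</child3>
  </node2>
</parent>"""


class NodeHandler(xml.sax.ContentHandler):
    """Collects <anything> and <child3> text for each direct child of the root."""

    WANTED = {"anything", "child3"}

    def __init__(self):
        super().__init__()
        self.depth = 0        # nesting depth; the root element is depth 1
        self.current = None   # record for the node currently being read
        self.buffer = None    # text chunks of a wanted element, else None
        self.rows = []        # finished records

    def startElement(self, name, attrs):
        self.depth += 1
        if self.depth == 2:                      # <node1>, <node2>, ...
            self.current = {"node": name}
        elif name in self.WANTED and self.current is not None:
            self.buffer = []

    def characters(self, content):
        # A parser may deliver one text node in several chunks, so accumulate.
        if self.buffer is not None:
            self.buffer.append(content)

    def endElement(self, name):
        if self.buffer is not None and name in self.WANTED:
            self.current[name] = "".join(self.buffer).strip()
            self.buffer = None
        if self.depth == 2:
            self.rows.append(self.current)
            self.current = None
        self.depth -= 1


def load_rows(xml_text):
    handler = NodeHandler()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.rows


def store(rows, conn):
    """Write the collected values to a (hypothetical) 'nodes' table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS nodes (node TEXT, anything TEXT, child3 TEXT)"
    )
    conn.executemany(
        "INSERT INTO nodes VALUES (?, ?, ?)",
        [(r["node"], r.get("anything"), r.get("child3")) for r in rows],
    )
    conn.commit()
```

For example, store(load_rows(SAMPLE), sqlite3.connect(":memory:")) parses the sample in one pass and inserts one row per node. Only the values of interest are kept; the rest of the document is never stored.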

Besides Java, there are scripting languages based upon
the SAX approach. In these languages, reading a single element node
can be done with a one-line script. The larger your file is,
the greater the speed advantage of a SAX-based script.
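
The reason large files favour this approach is that a SAX parser can be fed the document piece by piece and never needs the whole tree in memory. A small sketch of that, again using Python's xml.sax; the Counter class and the count_elements and read_in_chunks helpers are illustrative names, and the 64 KB chunk size is an arbitrary assumption.

```python
import xml.sax


class Counter(xml.sax.ContentHandler):
    """Counts elements with a given name; keeps no other state."""

    def __init__(self, wanted):
        super().__init__()
        self.wanted = wanted
        self.count = 0

    def startElement(self, name, attrs):
        if name == self.wanted:
            self.count += 1


def count_elements(chunks, wanted):
    """Feed the document to the parser in pieces and count <wanted> elements."""
    parser = xml.sax.make_parser()
    handler = Counter(wanted)
    parser.setContentHandler(handler)
    for chunk in chunks:
        parser.feed(chunk)   # chunks may end in the middle of a tag
    parser.close()
    return handler.count


def read_in_chunks(path, size=64 * 1024):
    """Yield a file's contents as fixed-size byte blocks."""
    with open(path, "rb") as f:
        while True:
            block = f.read(size)
            if not block:
                break
            yield block
```

count_elements(read_in_chunks("big.xml"), "anything") would then scan a file of any size with roughly constant memory use; "big.xml" is a placeholder file name.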
Jul 24 '05 #4
Thanks for telling me more about SAX. Which scripting languages have SAX
bindings? Can you please provide some references?

Regards,
Mukul

Jul 24 '05 #5
Mukul Gandhi wrote:
Thanks for telling me more about SAX. Which scripting languages have SAX
bindings? Can you please provide some references?


GNU Awk and bash have XML extensions which are not
yet merged into the official source code:

http://home.vrweb.de/~juergen.kahrs/gawk/XML/
http://home.eol.ca/~parkw/index.html#expat

Perl is probably the scripting language with
the longest tradition of supporting XML files.
Python, Ruby, etc. also have some kind of XML
support. Recently, there has been an ECMA proposal
for extending JavaScript with functions for processing
XML data. Use Google to find out more.
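
As one illustration of the Python support mentioned above, the standard library's xml.etree.ElementTree.iterparse streams through a document and hands back each element as it is completed, which gets close to a one-line extraction. The iter_values function is an illustrative name, not a library call.

```python
import io
import xml.etree.ElementTree as ET


def iter_values(source, tag):
    """Yield the text of every <tag> element while streaming through source."""
    for event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == tag:
            yield (elem.text or "").strip()
        # Discard content that has already been processed to limit memory use.
        elem.clear()
```

The source argument can be a file name or a file object, so list(iter_values(io.BytesIO(data), "anything")) works on bytes already in memory.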
Jul 24 '05 #6
Seems you have done good work with gawk XML. Very nice.

Regards,
Mukul

Jürgen Kahrs wrote:
GNU Awk and bash have XML extensions which are not
yet merged into the official source code:

http://home.vrweb.de/~juergen.kahrs/gawk/XML/
http://home.eol.ca/~parkw/index.html#expat



Jul 25 '05 #7
http://vtd-xml.sf.net
"mike henkins" <dd@nospam.com> wrote in message
news:42*********************@news.wanadoo.fr...
Does anyone have any experience or advice regarding the fastest way to do
that using SAX (or any other parser)?

Thanks!

Jul 27 '05 #8
