
xerces/SAX xml search

I am currently working on coding something in C++ which allows me to
find locations (line/column) of certain elements and attributes within
an XML file. For this task, I am trying to create a SAX parser that
searches through an XML file as it parses. I have set up a simple
SAX parser and an empty handler.

I need a pointer in the right direction on where to begin with making
the parser search. I am not sure of the best way to give the
parser something like a string that describes exactly
what I am looking for.

I am also unsure how to handle the search itself. Should I have two
parsers? (One that finds elements fitting the first
element's description, and a second that parses each such element to see
whether it contains all the criteria/hierarchy I am looking for.)

Thanks in advance,
- Marc

Apr 25 '07 #1
4 Replies


fo***********@gmail.com wrote:
> I need a pointer in the right direction on where to begin with making
> the parser search.
If you're doing it this way, you need to implement a SAX handler that
keeps track of what it's seen and whether that matches steps along the
way to whatever you're searching for.
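
As a rough illustration of "keeps track of what it's seen", here is a minimal sketch that assumes the search is expressed as a plain list of nested element names. PathMatcher is a hypothetical helper, not part of Xerces or SAX; the idea is that the handler's startElement and endElement callbacks would forward to it.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (not a Xerces class): remembers the chain of currently
// open elements and reports when that chain equals the target path.
class PathMatcher {
public:
    explicit PathMatcher(std::vector<std::string> target)
        : target_(std::move(target)) {}

    // Forward the element name here from the handler's startElement callback.
    // Returns true when the element just opened completes the target path.
    bool startElement(const std::string& name) {
        open_.push_back(name);
        const bool hit = (open_ == target_);
        if (hit) ++matches_;
        return hit;
    }

    // Call from the handler's endElement callback.
    void endElement() {
        assert(!open_.empty());  // events from a well-formed document are balanced
        open_.pop_back();
    }

    std::size_t matches() const { return matches_; }

private:
    std::vector<std::string> target_;  // e.g. {"library", "book", "title"}
    std::vector<std::string> open_;    // names of the elements currently open
    std::size_t matches_ = 0;
};
```

Only the stack of open element names is stored, so memory use grows with nesting depth rather than document size. Anything richer (wildcards, predicates, attribute tests) would need more state than this sketch keeps.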

Might make more sense to just use an off-the-shelf XPath/XSLT/XQuery
implementation, or subset implementation. XPath is the basic search
language for XML; XSLT and XQuery basically add functions and report
generation capability to that. The fully general versions of these do
require loading the entire document into memory, but subsets exist that
can be processed on the fly.

--
Joe Kesselman / Beware the fury of a patient man. -- John Dryden
Apr 25 '07 #2

> Might make more sense to just use an off-the-shelf XPath/XSLT/XQuery
> implementation, or subset implementation. XPath is the basic search
> language for XML; XSLT and XQuery basically add functions and report
> generation capability to that.
I understand the basics of XPath and XSLT and have read about XQuery, but
I still do not understand how this will help me create
something that can search for the specific elements and attributes I
provide (even if I converted what I was looking for to an XPath
expression).

Can you explain more about 'off-the-shelf XPath/XSLT/XQuery
implementations'?

May 1 '07 #3

fo***********@gmail.com wrote:
> I am currently working on coding something in C++ which allows me to
> find locations (line/column) of certain elements and attributes within
> an XML file.
OK, looking at this another time... You're almost certainly looking at
building your own SAX-based search, since you said you want line/column
information and most of the other APIs don't deliver that. (SAX may not
either, but you can at least try the SAX Locator API.)

Of course if you take that approach, it's entirely up to you to code the
logic that turns your search (however you want to express it) into a
state machine that can be driven by SAX events, or that runs over
whatever data structure you build from the SAX events to record the
document structure plus locator information (an annotated DOM, perhaps,
that adds location information... or some custom data structure tuned
for your own application's needs). Simple searches may not need much
stored state information; really complex ones may require the whole
document tree be available.
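
For the simple end of that range, a sketch of recording location information next to each match might look like the following. It assumes the search is "elements named X that carry attribute Y". LocatingSearch, Hit and Attribute are hypothetical names, and the line/column arguments are assumed to be read from the parser's locator (in Xerces-C++, the Locator object handed to the handler's setDocumentLocator) at the moment the start-element event fires.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical plain-data stand-ins for what a SAX handler receives.
struct Attribute {
    std::string name;
    std::string value;
};

// One search result, annotated with the position reported by the parser.
struct Hit {
    std::string element;
    std::string attributeValue;
    unsigned long line;
    unsigned long column;
};

// Hypothetical search: record every element with the given name that
// carries the given attribute, together with its reported location.
class LocatingSearch {
public:
    LocatingSearch(std::string element, std::string attribute)
        : element_(std::move(element)), attribute_(std::move(attribute)) {
        assert(!element_.empty());
    }

    // Call from the handler's startElement callback; line and column are
    // whatever the locator reports while that callback is running.
    void startElement(const std::string& name,
                      const std::vector<Attribute>& attrs,
                      unsigned long line, unsigned long column) {
        if (name != element_) return;
        for (const Attribute& a : attrs) {
            if (a.name == attribute_) {
                hits_.push_back(Hit{name, a.value, line, column});
                return;
            }
        }
    }

    const std::vector<Hit>& hits() const { return hits_; }

private:
    std::string element_;
    std::string attribute_;
    std::vector<Hit> hits_;
};
```

One caveat worth checking against your parser: SAX describes the locator position as approximately where the text of the event ends, so for a start tag the reported column is usually just past the closing ">" rather than at the "<".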

You've given us no indication of what kinds of searches you want to
perform, so generalities are all I can give you. You may be talking
about anything from a trivial subset of XPath to full XPath to full
XQuery to something more complicated than that. Obviously, simpler is
easier to implement.

Personal reaction: Line/column is usually a Bad Thing to use in the XML
world, because documents with identical semantics may not have the same
detailed syntax, and indeed tools don't always have that information
available to them. Expressing a point in the document as a simple XPath
to that location is often a better alternative.
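
A sketch of how such a path could be produced from the same stream of start/end events, assuming a positional form like /a[1]/b[2] is good enough. XPathTracker is a hypothetical helper, and it ignores namespaces by treating each element name as an opaque string.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical helper: builds a positional XPath for the element currently
// being opened by counting same-named siblings at each level.
class XPathTracker {
public:
    XPathTracker() : counts_(1) {}  // one counter table for the document level

    // Call on each start-element event. Returns the path of that element.
    std::string startElement(const std::string& name) {
        const int index = ++counts_.back()[name];  // 1-based, as in XPath
        steps_.push_back(name + "[" + std::to_string(index) + "]");
        counts_.emplace_back();  // fresh sibling counters for its children
        return current();
    }

    // Call on each end-element event.
    void endElement() {
        assert(!steps_.empty());
        steps_.pop_back();
        counts_.pop_back();
    }

    // Path of the innermost open element, or "" outside the root.
    std::string current() const {
        std::string path;
        for (const std::string& step : steps_) path += "/" + step;
        return path;
    }

private:
    std::vector<std::string> steps_;                  // one step per open element
    std::vector<std::map<std::string, int>> counts_;  // sibling counts per level
};
```

Unlike line/column, a path of this kind stays valid when the document is reformatted or re-serialized, as long as the element structure itself does not change.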
--
() ASCII Ribbon Campaign | Joe Kesselman
/\ Stamp out HTML e-mail! | System architexture and kinetic poetry
May 2 '07 #4

Joe Kesselman wrote:
> fo***********@gmail.com wrote:
>> I am currently working on coding something in C++ which allows me to
>> find locations (line/column) of certain elements and attributes within
>> an XML file.
>
> OK, looking at this another time... You're almost certainly looking at
> building your own SAX-based search, since you said you want line/column
> information and most of the other APIs don't deliver that. (SAX may not
> either, but you can at least try the SAX Locator API.)
Why not just use Expat?

http://expat.sourceforge.net/

Its XML_GetCurrentLineNumber() and XML_GetCurrentColumnNumber() functions report the parser's current position, and they now return unsigned integers.
May 2 '07 #5
