xerces/SAX xml search

I am currently writing something in C++ that lets me find the
locations (line/column) of certain elements and attributes within
an XML file. For this task, I am trying to create a SAX parser that
searches through an XML file as it parses. I have set up a simple
SAX parser and an empty handler.

I need a point in the right direction on where to begin with making
the parser search. I am not sure what the best way is to give the
parser something like a string that describes exactly what I am
looking for.

I am also unsure how to handle the search itself. Should I have two
parsers? (One that finds elements matching the first element's
description, and another that then parses each such element to see
whether it contains all the criteria/hierarchy I am looking for.)

Thanks in advance,
- Marc

Apr 25 '07 #1
fo***********@gmail.com wrote:
> I need a point in the right direction on where to begin with making
> the parse search.

If you're doing it this way, you need to implement a SAX handler that
keeps track of what it's seen and whether that matches steps along the
way to whatever you're searching for.
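
A minimal sketch of that idea, in plain C++ with no Xerces dependency so it can be tried on its own. Everything here is an assumption made for illustration: the `PathMatcher` class, the `Hit` struct, and the choice of a slash-separated string such as "catalog/book/title" as the search expression. In a real handler you would call these two methods from your own startElement/endElement callbacks and pass in whatever line and column the parser reports.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical record of one match: the position reported for the element.
struct Hit {
    std::size_t line;
    std::size_t column;
};

// Keeps the stack of currently open element names and records a hit
// whenever that stack equals the target path.
class PathMatcher {
public:
    // path is a slash-separated list of element names (an assumed format).
    explicit PathMatcher(const std::string& path) {
        std::istringstream in(path);
        std::string step;
        while (std::getline(in, step, '/')) {
            if (!step.empty()) target_.push_back(step);
        }
    }

    // Call from the SAX startElement callback.
    void startElement(const std::string& name,
                      std::size_t line, std::size_t column) {
        open_.push_back(name);
        if (open_ == target_) hits_.push_back(Hit{line, column});
    }

    // Call from the SAX endElement callback.
    void endElement(const std::string& name) {
        assert(!open_.empty() && open_.back() == name);
        open_.pop_back();
    }

    const std::vector<Hit>& hits() const { return hits_; }

private:
    std::vector<std::string> target_;  // steps we are searching for
    std::vector<std::string> open_;    // elements currently open
    std::vector<Hit> hits_;            // positions of matching elements
};
```

This only handles absolute paths of element names. Attribute tests, wildcards, or "anywhere below" steps would each need more state than a single stack comparison.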

Might make more sense to just use an off-the-shelf XPath/XSLT/XQuery
implementation, or subset implementation. XPath is the basic search
language for XML; XSLT and XQuery basically add functions and report
generation capability to that. The fully general versions of these do
require loading the entire document into memory, but subsets exist that
can be processed on the fly.

--
Joe Kesselman / Beware the fury of a patient man. -- John Dryden
Apr 25 '07 #2
> Might make more sense to just use an off-the-shelf XPath/XSLT/XQuery
> implementation, or subset implementation. XPath is the basic search
> language for XML; XSLT and XQuery basically add functions and report
> generation capability to that.

I understand the basics of XPath and XSLT and have read about XQuery,
but I still do not understand how this would help me create
something that can search for specific elements and attributes I
provide (even if I converted what I was looking for into an XPath
expression).

Can you explain more about 'off-the-shelf XPath/XSLT/XQuery
implementations'?

May 1 '07 #3
fo***********@gmail.com wrote:
> I am currently working on coding something in c++ which allows me to
> find locations (line/column) of certain elements and attributes within
> an xml file.

OK, looking at this another time... You're almost certainly looking at
building your own SAX-based search, since you said you want line/column
information and most of the other APIs don't deliver that. (SAX may not
either, but you can at least try the SAX Locator API, which the parser
hands to your handler through setDocumentLocator().)

Of course, if you take that approach, it's entirely up to you to code
the logic that turns your search (however you want to express it) into
one of two things: a state machine that can be driven by SAX events, or
code that runs over whatever data structure you build from the SAX
events to record the document structure plus locator information. That
structure might be an annotated DOM that adds location information, or
some custom data structure tuned for your own application's needs.
Simple searches may not need much stored state information; really
complex ones may require that the whole document tree be available.
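
As a rough sketch of the second option, here is a custom tree that stores a line and column on every element and is filled in from start/end events. It is plain C++ with no parser attached; `Node`, `TreeBuilder`, and `findWithChild` are hypothetical names, and the query (elements with a given name that have a child with a given name) is just one example of a search that cannot be answered at the moment the start tag is seen.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// One element plus the position the parser reported for it.
struct Node {
    std::string name;
    std::size_t line = 0;
    std::size_t column = 0;
    std::vector<std::unique_ptr<Node>> children;
};

// Builds the annotated tree; call its methods from the SAX callbacks.
class TreeBuilder {
public:
    TreeBuilder() { open_.push_back(&root_); }
    // open_ points into root_, so copying or moving would dangle.
    TreeBuilder(const TreeBuilder&) = delete;
    TreeBuilder& operator=(const TreeBuilder&) = delete;

    void startElement(const std::string& name,
                      std::size_t line, std::size_t column) {
        std::unique_ptr<Node> node(new Node());
        node->name = name;
        node->line = line;
        node->column = column;
        Node* raw = node.get();
        open_.back()->children.push_back(std::move(node));
        open_.push_back(raw);
    }

    void endElement() {
        assert(open_.size() > 1);  // never pop the synthetic root
        open_.pop_back();
    }

    const Node& root() const { return root_; }

private:
    Node root_;                // synthetic document node, has no name
    std::vector<Node*> open_;  // path from root_ to the current element
};

// Collects every element called `name` that has a direct child
// called `childName`, in document order.
void findWithChild(const Node& node, const std::string& name,
                   const std::string& childName,
                   std::vector<const Node*>& out) {
    if (node.name == name) {
        for (const auto& child : node.children) {
            if (child->name == childName) {
                out.push_back(&node);
                break;
            }
        }
    }
    for (const auto& child : node.children) {
        findWithChild(*child, name, childName, out);
    }
}
```

The cost is that the whole document sits in memory, which is the trade-off described above. For large files a state machine that discards what it no longer needs is the cheaper route.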

You've given us no indication of what kinds of searches you want to
perform, so generalities are all I can give you. You may be talking
about anything from a trivial subset of XPath to full XPath to full
XQuery to something more complicated than that. Obviously, simpler is
easier to implement.

Personal reaction: Line/column is usually a Bad Thing to use in the XML
world, because documents with identical semantics may not have the same
detailed syntax, and indeed tools don't always have that information
available to them. Expressing a point in the document as a simple XPath
to that location is often a better alternative.
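
For what it's worth, producing such a path from SAX-style events takes very little state. The sketch below is plain C++ under stated assumptions: `XPathTracker` is a hypothetical helper, it ignores namespaces, and it always writes an explicit position such as book[2] for every step.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Builds a positional XPath like /catalog[1]/book[2] for the element
// currently being parsed. Call its methods from the SAX callbacks.
class XPathTracker {
public:
    // Start with one counter table for the children of the document node.
    XPathTracker() : counts_(1) {}

    // Returns the path of the element that was just opened.
    std::string startElement(const std::string& name) {
        // Position among same-named siblings seen so far under the parent.
        std::size_t index = ++counts_.back()[name];
        steps_.push_back(name + "[" + std::to_string(index) + "]");
        counts_.emplace_back();  // fresh counters for this element's children
        return current();
    }

    void endElement() {
        assert(!steps_.empty());
        steps_.pop_back();
        counts_.pop_back();
    }

    // Path of the innermost open element, or "" outside the root element.
    std::string current() const {
        std::string path;
        for (const auto& step : steps_) {
            path += "/";
            path += step;
        }
        return path;
    }

private:
    std::vector<std::string> steps_;
    std::vector<std::map<std::string, std::size_t>> counts_;
};
```

A path built this way identifies the same element even if the file is re-indented or re-serialized, which line/column cannot promise.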
--
() ASCII Ribbon Campaign | Joe Kesselman
/\ Stamp out HTML e-mail! | System architexture and kinetic poetry
May 2 '07 #4
Joe Kesselman wrote:
> fo***********@gmail.com wrote:
>> I am currently working on coding something in c++ which allows me to
>> find locations (line/column) of certain elements and attributes within
>> an xml file.
>
> OK, looking at this another time... You're almost certainly looking at
> building your own SAX-based search, since you said you want line/column
> information and most of the other APIs don't deliver that. (SAX may not
> either, but you can at least try the SAXLocator API.)

Why not just use Expat?

http://expat.sourceforge.net/

XML_GetCurrentLineNumber() and XML_GetCurrentColumnNumber() now return unsigned integers.
May 2 '07 #5
