Using Xerces SAX to parse just part of an input stream?

I'm trying to put together code to deal with a SOAP with attachments
response, and I'd like to process the response in a single pass. The
SOAP with attachments specification returns XML in a MIME message, so
it looks like this:

--4389012.48390
Content-Type: text/xml

<?xml version="1.0" encoding="UTF-8"?>
<soap-env:Envelope
xmlns:soap-env="http://schemas.xmlsoap.org/soap/envelope/">
....snip...
</soap-env:Envelope>
--4389012.48390
Content-Type: text/xml
Content-Id: RootNode

<?xml version="1.0" encoding="UTF-8"?><RootNode>
... snip ...
</RootNode>
--4389012.48390--

So what I'd LIKE to be able to do is to parse the incoming input stream
up to the <?xml> declaration, hand the input stream over to a SAX
parser, let it parse to the end of the document, and then have it
return at the end so I can continue parsing the same input stream.

The problem is that "SAXParser.parse( new InputSource( inputStream ),
handler );" appears to want to consume the input stream until it
reaches EOF on the input stream (which, when given the input stream
above, fails with the error message "Content is not allowed in trailing
section."). Is this something I can work around in Xerces, or is there
a better SAX implementation that will let me tell the parser to stop
when it reaches the last element?

May 10 '06 #1
Nobody wrote:
The problem is that "SAXParser.parse( new InputSource( inputStream ),
handler );" appears to want to consume the input stream until it
reaches EOF on the input stream (which, when given the input stream
above, fails with the error message "Content is not allowed in trailing
section.").


Unfortunately, the definition of XML parsing does say that there
shouldn't be anything following the document element.

Possible solution: Create a stream filter which you pass the
"--4389012.48390" at the start of the enclosed message, and which
delivers characters only until it sees the corresponding
"--4389012.48390" mark at the end, returning EOF thereafter. Run the
parser from that filter-stream rather than direct from your original
input stream.

In other words, sweep the issue under the carpet so the parser doesn't
have to see it.
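
A minimal sketch of such a filter in Java, under some assumptions: the class name BoundaryInputStream is made up for illustration, the boundary is assumed to appear at the start of a line as "--" plus the boundary string, and the caller is assumed to have already consumed the part headers up to the blank line. It reads one byte at a time, so it is slow, but it shows the idea:

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.PushbackInputStream;
import java.nio.charset.StandardCharsets;

/**
 * Hypothetical helper (not part of Xerces or the JDK): delivers bytes from
 * the wrapped stream until a line starting with "--" + boundary is seen,
 * then reports EOF. The wrapped stream is left positioned just after the
 * boundary text so the caller can keep reading the next MIME part.
 */
final class BoundaryInputStream extends InputStream {
    private final PushbackInputStream in;
    private final byte[] delimiter;   // "\n--" + boundary
    private final byte[] buf;         // lookahead scratch space
    private boolean done;

    BoundaryInputStream(InputStream source, String boundary) {
        this.delimiter = ("\n--" + boundary).getBytes(StandardCharsets.US_ASCII);
        this.buf = new byte[delimiter.length];
        // Pushback capacity must hold a failed partial match.
        this.in = new PushbackInputStream(source, delimiter.length);
    }

    @Override
    public int read() throws IOException {
        if (done) {
            return -1;
        }
        // Read ahead for as long as the bytes keep matching the delimiter.
        int n = 0;
        boolean matched = true;
        while (n < delimiter.length) {
            int b = in.read();
            if (b < 0) {
                matched = false;      // real EOF before a full match
                break;
            }
            buf[n] = (byte) b;
            n++;
            if (buf[n - 1] != delimiter[n - 1]) {
                matched = false;
                break;
            }
        }
        if (n == 0 || matched) {
            done = true;              // real EOF, or the boundary was found
            return -1;
        }
        // Not a boundary: hand back the first byte, push the rest back.
        in.unread(buf, 1, n - 1);
        return buf[0] & 0xFF;
    }

    // close() is deliberately inherited as a no-op, so a parser that closes
    // its input at end of document does not close the underlying stream.
}
```

You would then call something like parser.parse( new InputSource( new BoundaryInputStream( inputStream, "4389012.48390" ) ), handler ). With CRLF line endings the "\r" before the boundary is still delivered to the parser, which should be harmless since trailing whitespace after the document element is allowed. When the filter reports EOF, the original stream still holds the rest of the boundary line (either "--" for the final boundary or the line ending), so skip that before reading the next part's headers.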
May 10 '06 #2
Thanks - that was pretty much what I've come up with, although I was
hoping for something simpler. Of course, it doesn't look like writing
a SAX parser is all THAT hard...

May 10 '06 #3
Nobody wrote:
Thanks - that was pretty much what I've come up with, although I was
hoping for something simpler. Of course, it doesn't look like writing
a SAX parser is all THAT hard...


XML 1.0 was designed with the goal that writing a parser should be about
the right size for a student project.

Of course that's before namespaces, and schemas, and other things were
added to the mix.

Experience has shown that this is very much a 90/10 problem. You can get
90% of the behavior for 10% of the effort; the other 10% takes the other
90% (or more) of the effort. And making it perform well can add yet
another 90%...
--
Joe Kesselman / Beware the fury of a patient man. -- John Dryden
May 10 '06 #4
