c++ parsing with mix of sax & dom for large files

Hello

I'm not familiar with Xerces in C++.

Currently, we parse XML files with Perl (typically XML::Twig) and Java
(dom4j).
With both APIs, there is a very comfortable way to mix SAX and DOM, by
setting handlers on some element paths.

The XML file is parsed, and once a defined path is reached, the
element is handed to a handler subroutine.
The whole subtree can then be explored with DOM-like calls (XPath etc.) as an
in-memory element.
Then the tree can be purged, and the memory released.

It's quite a convenient merge, to get the best of both worlds.

Is that possible with Xerces in C++?
I cannot find any simple answer in the Apache docs.

thanks
Alex

Jan 10 '07 #1
alex masselot wrote:
> Hello
>
> I'm not familiar with Xerces in C++.
>
> Currently, we parse XML files with Perl (typically XML::Twig) and Java
> (dom4j).
> With both APIs, there is a very comfortable way to mix SAX and DOM, by
> setting handlers on some element paths.
>
> The XML file is parsed, and once a defined path is reached, the
> element is handed to a handler subroutine.
> The whole subtree can then be explored with DOM-like calls (XPath etc.) as an
> in-memory element.
> Then the tree can be purged, and the memory released.

It's a job for Active Tags and the XML Control Language!

XCL pipelines work the same way in RefleX (the engine);
however, you can also use XPath directly on SAX streams:
you can define XPath patterns for filtering (as in XSLT), except that
large files are supported as well.

Additionally, you can "cast" a tree or a subtree from DOM to SAX or SAX
to DOM at will.

Here are some examples:
http://reflex.gforge.inria.fr/saxPatterns.html#N802B53
http://reflex.gforge.inria.fr/tutorial.html#N801C30

and the slides that were shown at XML 2006 in Boston:
http://disc.inria.fr/perso/philippe....ctive-tags.pdf (pages 7 and 8)
> It's quite a convenient merge, to get the best of both worlds.

This is also my opinion; you can achieve very complex things with
very few active tags.

> Is that possible with Xerces in C++?

Sure! As you explain yourself, it's not a question of language.

> I cannot find any simple answer in the Apache docs.
>
> thanks
> Alex

--
Cordialement,

///
(. .)
--------ooO--(_)--Ooo--------
| Philippe Poulard |
-----------------------------
http://reflex.gforge.inria.fr/
Have the RefleX !
Jan 10 '07 #2
The traditional technique for mixing SAX and DOM is to use a SAX parser
together with a SAX-driven DOM-tree builder, and to write a SAX handler
that filters the events appropriately before passing them to the builder.

Once you've got your filtered DOM, you can of course run a compatible
XPath implementation against it. DOM Level 3 introduced XPath support,
though not all DOMs implement that optional feature and I'm not sure
offhand whether Xerces-C's DOM includes it or not. If not, I presume
Xalan-C has an XPath API, though I'm not sure how efficiently it
interoperates with the Xerces-C DOM (Xalan prefers to manipulate its own
data model).

So the answer is: Yes, it's possible, though you may need to write a bit
of code to glue it all together.
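
As a rough illustration of that glue code, here is a minimal sketch in plain standard C++. It does not use Xerces-C at all: `Node`, `SubtreeFilter` and `demo` are hypothetical names made up for this example, and the events are fed by hand rather than by a real parser. The filter drops every event outside a target element, builds a small tree while inside one, passes the finished subtree to a callback, then frees it. With Xerces-C, the three event methods would presumably live in a class derived from the SAX2 `DefaultHandler`, with the parser driving them.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Minimal in-memory element: a stand-in for a DOM node (NOT Xerces-C's DOMElement).
struct Node {
    std::string name;
    std::string text;
    std::vector<std::unique_ptr<Node>> children;
};

// Hypothetical twig-style filter: receives SAX-like events, builds a subtree
// only while inside an element named `target`, and hands each completed
// subtree to `handler` before freeing it.
class SubtreeFilter {
public:
    using Handler = std::function<void(const Node&)>;

    SubtreeFilter(std::string target, Handler handler)
        : target_(std::move(target)), handler_(std::move(handler)) {}

    void startElement(const std::string& name) {
        if (stack_.empty()) {
            if (name != target_) return;        // outside a target: drop the event
            root_ = std::make_unique<Node>();
            root_->name = name;
            stack_.push_back(root_.get());
            return;
        }
        auto child = std::make_unique<Node>();  // inside a target: grow the subtree
        child->name = name;
        Node* raw = child.get();
        stack_.back()->children.push_back(std::move(child));
        stack_.push_back(raw);
    }

    void characters(const std::string& chars) {
        if (!stack_.empty()) stack_.back()->text += chars;
    }

    void endElement(const std::string& name) {
        if (stack_.empty()) return;             // end tag of a skipped element
        assert(stack_.back()->name == name);    // events must be well nested
        stack_.pop_back();
        if (stack_.empty()) {                   // the target subtree is complete
            handler_(*root_);
            root_.reset();                      // purge: release the subtree's memory
        }
    }

private:
    std::string target_;
    Handler handler_;
    std::unique_ptr<Node> root_;
    std::vector<Node*> stack_;                  // open elements of the current subtree
};

// Feeds a hand-written event sequence equivalent to
// <db><rec><id>1</id></rec><rec><id>2</id></rec></db>
// and collects the text of every <id> child seen by the handler.
std::vector<std::string> demo() {
    std::vector<std::string> ids;
    SubtreeFilter filter("rec", [&ids](const Node& rec) {
        for (const auto& child : rec.children) {
            if (child->name == "id") ids.push_back(child->text);
        }
    });

    const std::vector<std::string> values{"1", "2"};
    filter.startElement("db");
    for (const std::string& value : values) {
        filter.startElement("rec");
        filter.startElement("id");
        filter.characters(value);
        filter.endElement("id");
        filter.endElement("rec");
    }
    filter.endElement("db");
    return ids;
}
```

Only one `rec` subtree is alive at any time, so memory use depends on the largest record rather than on the size of the whole file.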
Jan 10 '07 #3