Bytes | Software Development & Data Engineering Community

process large file

I am having some trouble processing a large file (40 MB) in Java. For the
problem I have, I tried SAX but didn't find it suitable (well, the coding
just becomes a little complicated). DOM is better, but it has some memory
overhead. Can someone share their experience dealing with this kind of
situation?

Cheers,
JZ
Jul 20 '05 #1
"Jimmy Zhang" wrote in message news:<2E3cc.182494$1p.2161144@attbi_s54>...
JZ> I am having some trouble processing some large file (40mb) in Java. For the
JZ> problem I have, I have tried to use SAX, but doesn't
JZ> find it suitable (well, coding just becomes a little complicated). So DOM is
JZ> better but a little overhead on memory. Can someone share with me their
JZ> experiences in dealing with their situations?


One good option is to use a pull parser. It has a simpler interface
than SAX, but doesn't have the memory overhead of DOM.

See http://www.extreme.indiana.edu/xgws/xsoap/xpp/ or
http://www.xmlpull.org/
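
To give a feel for the pull style, here is a minimal sketch. It uses StAX (javax.xml.stream), the pull API that ships with the JDK, rather than the XPP/XmlPull libraries linked above, so treat it as an illustration of the same idea and not as their API. The class name, method name, and the "orders"/"order" element names are made up for the example.

```java
import java.io.Reader;
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper class, for illustration only.
class PullParserSketch {

    // Counts start tags with the given local name. The caller drives the
    // parser with next(), so no handler callbacks are needed (unlike SAX)
    // and no document tree is built (unlike DOM).
    static int countElements(Reader source, String localName) {
        try {
            XMLStreamReader reader =
                    XMLInputFactory.newInstance().createXMLStreamReader(source);
            try {
                int count = 0;
                while (reader.hasNext()) {
                    int event = reader.next(); // pull the next parse event
                    if (event == XMLStreamConstants.START_ELEMENT
                            && localName.equals(reader.getLocalName())) {
                        count++;
                    }
                }
                return count;
            } finally {
                reader.close();
            }
        } catch (XMLStreamException e) {
            // Wrapped so callers do not have to handle a checked exception.
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // Made-up sample data; a real run would pass a reader over the file.
        String xml = "<orders><order id=\"1\"/><order id=\"2\"/><note/></orders>";
        System.out.println(countElements(new StringReader(xml), "order"));
    }
}
```

For a large file you would pass a reader (or an InputStream, via the createXMLStreamReader overload that takes one) over the file itself, so that only the current event is held at any time.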

Toivo Lainevool
http://www.XMLPatterns.com - Develop effective DTDs and XML Schema
documents for your XML using structural design patterns.
Jul 20 '05 #2
Hello, Toivo!
You wrote on 7 Apr 2004 12:09:05 -0700:
[Sorry, skipped]

TL> One good option is to use a pull parser. It has a simpler interface
TL> than SAX, but doesn't have the memory overhead of DOM.

TL> See http://www.extreme.indiana.edu/xgws/xsoap/xpp/ or
TL> http://www.xmlpull.org/

Interesting that MS calls it cursor-model processing (XPathNavigator), and
calls the SAX-based approach the push/pull model (XmlWriter/XmlReader).

With best regards, Alexey Shirshov.
Jul 20 '05 #3
How much memory does XPathNavigator consume? I assume it loads everything
into memory, like DOM.

Jul 20 '05 #4
Hello, Jimmy!
You wrote on Thu, 08 Apr 2004 23:40:00 GMT:

JZ> How much memory does XPathNavigator consume? I assume it loads
JZ> everything in memory like DOM.

Well, XPathNavigator is just an interface (actually an abstract class), so
we can't talk about its performance as such. What matters is the
implementation of this class. XmlDocument, which represents the DOM,
implements it in the DocumentXPathNavigator class; you can create an
instance of this class via the CreateNavigator method.
You can get another implementation via CreateNavigator of the XPathDocument
class.
The first implementation uses the DOM as its underlying data model, while
the second uses the XPath data model.
I think that for very large documents XPathDocument will be much faster.
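
Since the original question was about Java, here is a rough Java analogue of the same point. This is the JDK's javax.xml.xpath API, not the .NET classes described above. The XPath object is only the query interface; the memory cost comes from whatever it is evaluated against, which in this sketch is a fully loaded DOM Document. The class and method names are made up for the example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper class, for illustration only.
class XPathOverDomSketch {

    // Counts the nodes selected by an XPath expression.
    static int countMatches(String xml, String expression) {
        try {
            // This step builds the whole tree in memory; it is where the
            // DOM overhead comes from, not the XPath object below.
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));

            XPath xpath = XPathFactory.newInstance().newXPath();
            Double result = (Double) xpath.evaluate(
                    "count(" + expression + ")", doc, XPathConstants.NUMBER);
            return result.intValue();
        } catch (Exception e) {
            // Wrapped so callers do not have to handle checked exceptions.
            throw new IllegalStateException("Could not evaluate XPath", e);
        }
    }

    public static void main(String[] args) {
        // Made-up sample data.
        String xml = "<orders><order id=\"1\"/><order id=\"2\"/><note/></orders>";
        System.out.println(countMatches(xml, "/orders/order"));
    }
}
```

So for a 40 MB input, this route has the same memory profile as plain DOM.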
[Sorry, skipped]
With best regards, Alexey Shirshov.
Jul 20 '05 #5
