Parsing XML file slow!

Hi everyone,

I've written some code that parses an XML file. The whole thing works correctly, but it's really slow. My example only has two elem_x elements. A normal XML file could theoretically have up to a 5-figure number of elem_x elements. I'm testing with a file that has 150 elem_x elements, and even that is taking too long. Basically I'm trying to turn the XML file into a column-based format, as follows:

Input XML file

<elem_x att_a="1" att_b="2" att_c="3">
<elem_y>
<elem_z att_k="9" att_l="8" att_m="7"></elem_z>
</elem_y>
<elem_y lang="EN-US">
<elem_z att_k="9" att_m="7"></elem_z>
<elem_r>textEN</elem_r>
</elem_y>
<elem_y lang="DE-DE">
<elem_z att_k="9" att_l="8" att_m="7"></elem_z>
<elem_r>textDE</elem_r>
</elem_y>
</elem_x>
<elem_x att_a="4" att_b="5" att_c="6">
<elem_y>
<elem_z att_k="6" att_l="5" att_m="4"></elem_z>
</elem_y>
<elem_y lang="EN-US">
<elem_z att_k="6" att_l="5" att_m="4"></elem_z>
<elem_r>textEN</elem_r>
</elem_y>
<elem_y lang="DE-DE">
<elem_z att_k="6" att_l="5" att_m="4"></elem_z>
<elem_r>textDE</elem_r>
</elem_y>
</elem_x>...

The parsed output is a tab-separated file and should look something like this:

att_a att_b att_c att_k att_l att_m EN-US DE-DE
1 2 3 9 7 textEN textDE
4 5 6 6 5 4... textEN textDE

Since some attributes can be missing in a particular element, I have to loop through the entire file to ensure that the column order does not get mixed up. To complicate the matter slightly, I only want to read the attributes from one of the elem_y elements, as they are always the same for each elem_y.
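One way to keep the columns aligned without re-scanning the file is to hold each elem_x as a name-to-value mapping and build the header as the union of all names seen. The following is only a minimal sketch in Python, not the .NET code described in this post; `rows_to_tsv` and the two sample rows are hypothetical, and the header uses first-seen order, which is an assumption about what "column order" should mean here.

```python
def rows_to_tsv(rows):
    """Render a list of dicts as tab-separated text with one stable header."""
    columns = {}  # a dict keeps insertion order, so it works as an ordered set
    for row in rows:
        for name in row:
            columns.setdefault(name, None)
    header = list(columns)

    lines = ["\t".join(header)]
    for row in rows:
        # A missing attribute becomes an empty cell, so later columns never shift
        lines.append("\t".join(row.get(name, "") for name in header))
    return "\n".join(lines) + "\n"


# Hypothetical rows: the first one lacks att_l
rows = [
    {"att_a": "1", "att_b": "2", "att_c": "3", "att_k": "9", "att_m": "7"},
    {"att_a": "4", "att_b": "5", "att_c": "6", "att_k": "6", "att_l": "5", "att_m": "4"},
]
print(rows_to_tsv(rows), end="")
```

With first-seen order, att_l lands after att_m in this sample because the first row does not contain it. If a fixed order such as att_k, att_l, att_m is required, the header list would have to be sorted or supplied up front.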

I've used the XmlDocument class; with XPath and SelectNodes I can drill down through the XML file, navigating to each node block and looping through it, reading the attribute names and values accordingly. By doing this I can build an array which I can then write to the output file. However, I have a feeling my problem is the high number of loops, which is slowing everything down. I've also tried parsing the XML file with an XmlReader and loading it into a DataSet. This is much faster, but it does not seem to help me solve my problem, as the attributes for elem_z are not read out on one line, but line by line.
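The nested XPath loops could in principle be replaced by a single forward pass that builds one record per elem_x. The sketch below illustrates that idea with Python's `xml.etree.ElementTree.iterparse` rather than the .NET classes named here. The wrapping `<root>` element, the name `read_records`, and the choice to take the elem_z attributes from the first elem_y are all assumptions made for illustration.

```python
import io
import xml.etree.ElementTree as ET


def read_records(source):
    """Yield one dict per elem_x in a single forward pass over the XML."""
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag != "elem_x":
            continue
        record = dict(elem.attrib)
        z_taken = False
        for y in elem.findall("elem_y"):
            z = y.find("elem_z")
            if z is not None and not z_taken:
                # elem_z attributes are said to repeat, so read them only once
                record.update(z.attrib)
                z_taken = True
            lang, r = y.get("lang"), y.find("elem_r")
            if lang is not None and r is not None:
                # The language code becomes the column name for the text
                record[lang] = r.text or ""
        yield record
        elem.clear()  # drop the finished subtree once the record is consumed


# Hypothetical input: the fragment from this post wrapped in a single root
SAMPLE = b"""<root>
<elem_x att_a="1" att_b="2" att_c="3">
<elem_y><elem_z att_k="9" att_l="8" att_m="7"></elem_z></elem_y>
<elem_y lang="EN-US"><elem_z att_k="9" att_m="7"></elem_z><elem_r>textEN</elem_r></elem_y>
<elem_y lang="DE-DE"><elem_z att_k="9" att_l="8" att_m="7"></elem_z><elem_r>textDE</elem_r></elem_y>
</elem_x>
</root>"""

records = list(read_records(io.BytesIO(SAMPLE)))
print(records[0])
```

Each elem_x is visited once and each of its children once, so the work grows linearly with file size instead of with the number of XPath queries issued per node.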

Is XML my problem? Should I try and use XSLT to transform the XML instead? Or would simply parsing it as a text file be more effective?

Any assistance would be greatly appreciated.

Robert
Feb 28 '06 #1