Xml parser performance and Xml generation

Hello,

I need to parse relatively small XML files (less than 100 kB) and to
write XML files that can get quite big (the bigger I can get without
having performance issues, the better).

I looked for benchmarks to help me choose between existing parsers but
didn't manage to find a suitable one. I was looking for something
similar to the following one I found for Python:
http://effbot.org/zone/celementtree.htm#benchmarks

I know benchmarks are not perfect but they could help me anyway. For
example, I wonder whether SimpleXML uses a lot of memory, and whether
it is fast despite its simplicity.

I'd also like to know if writing the XML file using plain strings
would be a lot faster and/or would use a lot less memory than using a
parser to build them. And if I choose this option, what is the fastest
way to build big strings in PHP?

Thank you,
Bastien

Sep 5 '07 #1

"Bastien Continsouzas" <de*@continsouzas.com> wrote in message
news:11**********************@19g2000hsx.googlegroups.com...
> Hello,
>
> I need to parse relatively small XML files (less than 100 kB) and to
> write XML files that can get quite big (the bigger I can get without
> having performance issues, the better).

i don't think you'll notice any appreciable difference between parsers with
file sizes that small.
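
One way to check this on your own files is to time each parser directly. The sketch below is illustrative only: `time_parse` is a hypothetical helper, the sample document is synthetic (roughly 50 kB), and the figures it prints will vary by machine and PHP version. It assumes PHP 7.4 or later with the SimpleXML and DOM extensions enabled.

```php
<?php
// Hypothetical helper: run a parse callback several times and return the
// mean wall-clock time per run, in seconds.
function time_parse(callable $parse, string $xml, int $runs = 100): float
{
    $start = microtime(true);
    for ($i = 0; $i < $runs; $i++) {
        $parse($xml);
    }
    return (microtime(true) - $start) / $runs;
}

// Synthetic sample document; swap in file_get_contents() on a real file.
$xml = '<records>'
    . str_repeat('<record><field1>a</field1><field2>b</field2></record>', 1000)
    . '</records>';

$simple = time_parse(fn($s) => simplexml_load_string($s), $xml);

$dom = time_parse(function ($s) {
    $doc = new DOMDocument();
    $doc->loadXML($s);
    return $doc;
}, $xml);

// memory_get_peak_usage() reports the peak for the whole script so far,
// not for each parser separately.
printf(
    "SimpleXML: %.6f s, DOM: %.6f s, peak memory: %d bytes\n",
    $simple,
    $dom,
    memory_get_peak_usage()
);
```

To compare memory per parser, run each one in a separate script so the peak figures do not overlap.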

> I'd also like to know if writing the XML file using plain strings
> would be a lot faster and/or would use a lot less memory than using a
> parser to build them. And if I choose this option, what is the fastest
> way to build big strings in PHP?

it all depends on the xml you are writing. will your output contain
marked-up elements (data typing, etc.) or will it be as simple as:

<records>
  <record>
    <field1 />
    <field2 />
    <etc />
  </record>
</records>

will it be stand-alone or pull schema info into the mix?

it will almost always be quicker to put it together yourself...however, doing
so can make the script that produces the output harder to maintain, and you
are likely to end up duplicating code. that's if your script is written
procedurally.
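
As an illustration of putting the output together by hand, here is a minimal procedural sketch. The function name `records_to_xml` and the sample data are hypothetical, and the code assumes every array key is already a valid XML element name.

```php
<?php
// Hypothetical helper: serialise rows of field => value pairs by hand.
function records_to_xml(array $records): string
{
    $xml = "<records>\n";
    foreach ($records as $record) {
        $xml .= "<record>\n";
        foreach ($record as $name => $value) {
            // Escape &, <, > and quotes so values cannot break the markup.
            $escaped = htmlspecialchars((string) $value, ENT_XML1 | ENT_QUOTES, 'UTF-8');
            $xml .= "<{$name}>{$escaped}</{$name}>\n";
        }
        $xml .= "</record>\n";
    }
    return $xml . "</records>\n";
}

echo records_to_xml([
    ['field1' => 'Tom & Jerry', 'field2' => '1 < 2'],
]);
```

Appending with `.=` is a common way to grow a string in PHP. Whether it beats collecting the pieces in an array and calling implode() is worth measuring on your own data before you rely on it.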

having said that, you never need a parser to output your xml. to avoid the
aforementioned problems, you could write classes that describe your data and
give each one a toXml() function that uses that information to build the
appropriate output. this is fundamentally different from procedural code and
is what i recommend...especially since you can do so many other things with
the object, like providing a toCsv() interface, or extending it to create
variants that need to 'look' slightly different depending on the overall xml
they are included in, or on the client you are producing it for.
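
A minimal sketch of that object approach follows. The class names `Record` and `AttributeRecord` and the fields `field1` and `field2` are hypothetical, chosen only to mirror the sample XML. It assumes PHP 7.4 or later.

```php
<?php
// Hypothetical data class that knows how to serialise itself.
class Record
{
    protected string $field1;
    protected string $field2;

    public function __construct(string $field1, string $field2)
    {
        $this->field1 = $field1;
        $this->field2 = $field2;
    }

    // Escape a value for use in XML text or attribute content.
    protected static function escape(string $value): string
    {
        return htmlspecialchars($value, ENT_XML1 | ENT_QUOTES, 'UTF-8');
    }

    public function toXml(): string
    {
        return '<record>'
            . '<field1>' . self::escape($this->field1) . '</field1>'
            . '<field2>' . self::escape($this->field2) . '</field2>'
            . '</record>';
    }

    public function toCsv(): string
    {
        // Quote every value and double any embedded quotes.
        $quote = fn(string $v): string => '"' . str_replace('"', '""', $v) . '"';
        return $quote($this->field1) . ',' . $quote($this->field2);
    }
}

// A variant with a different "look" overrides only the XML output.
class AttributeRecord extends Record
{
    public function toXml(): string
    {
        return '<record field1="' . self::escape($this->field1)
            . '" field2="' . self::escape($this->field2) . '" />';
    }
}

$record = new Record('Tom & Jerry', 'say "hi"');
echo $record->toXml(), "\n";
echo $record->toCsv(), "\n";
echo (new AttributeRecord('a', 'b'))->toXml(), "\n";
```

Keeping the escaping in one method means each output format can change without touching the data itself.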

just a thought.
Sep 5 '07 #2
"Bastien Continsouzas" <de*@continsouzas.com> wrote in message
news:11**********************@19g2000hsx.googlegroups.com...
> Hello,
>
> I need to parse relatively small XML files (less than 100 kB) and to
> write XML files that can get quite big (the bigger I can get without
> having performance issues, the better).
>
> i don't think you'll notice any appreciable difference between parsers
> with file sizes that small.
>
> I'd also like to know if writing the XML file using plain strings
> would be a lot faster and/or would use a lot less memory than using a
> parser to build them. And if I choose this option, what is the fastest
> way to build big strings in PHP?

one last thing...if your xml is going to run over your memory constraints,
you'd want to look at writing to file either in chunks or line by line.
then do with the file what you intend, and delete it afterwards.
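
A minimal sketch of writing record by record, so that only the current record is held in memory. The names `write_records` and `sample_rows` are hypothetical, and the code assumes every array key is a valid XML element name.

```php
<?php
// Hypothetical helper: stream records to a file one at a time.
function write_records(string $path, iterable $records): void
{
    $handle = fopen($path, 'wb');
    if ($handle === false) {
        throw new RuntimeException("Cannot open {$path} for writing");
    }
    fwrite($handle, "<records>\n");
    foreach ($records as $record) {
        $line = '<record>';
        foreach ($record as $name => $value) {
            $line .= "<{$name}>"
                . htmlspecialchars((string) $value, ENT_XML1 | ENT_QUOTES, 'UTF-8')
                . "</{$name}>";
        }
        fwrite($handle, $line . "</record>\n");
    }
    fwrite($handle, "</records>\n");
    fclose($handle);
}

// A generator yields rows lazily instead of building a large array first.
function sample_rows(int $count): Generator
{
    for ($i = 1; $i <= $count; $i++) {
        yield ['field1' => "row {$i}", 'field2' => $i * 2];
    }
}

$path = tempnam(sys_get_temp_dir(), 'xml');
write_records($path, sample_rows(100000));
echo filesize($path), " bytes written\n";
unlink($path); // delete the file once it has served its purpose
```

PHP's XMLWriter extension offers a similar streaming style with escaping built in, if hand-written tags become hard to manage.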
Sep 5 '07 #3
