Bytes | Software Development & Data Engineering Community

Element name length & performance implications

I know that longer element names increase the size of an XML document,
ultimately resulting in a larger amount of data at parse-time. Is there
anything else, specifically related to an element name and its length,
that can impact the performance of an XML parser?

The bulk of our XML parsing uses the latest and greatest version of
Apache Xerces.

Oct 25 '05 #1
Tom Kerigan wrote:
> ultimately resulting in a larger amount of data at parse-time. Is there
> anything else, specifically related to an element name and its length,
> that can impact the performance of an XML parser?


The number of attributes of an element may have
an influence. I have seen a parser (I think it
was xmllint) which seemed to have runtime O(n^2)
where n=number of attributes. This became unbearable
in some unlikely situations (more than 1000 attributes).
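
One plausible source of that quadratic behaviour (an assumption on my part, not something checked against xmllint's source) is the well-formedness rule that no attribute name may appear twice on the same element. If every new attribute is compared against all the earlier ones, the work grows as n(n-1)/2; a hash set keeps it roughly linear. A minimal Python sketch, with function names invented for illustration, that simply counts the comparisons:

```python
def has_duplicate_linear_scan(names):
    """Naive uniqueness check: compare each name against all earlier ones.

    Returns (found_duplicate, comparisons_made).
    """
    comparisons = 0
    for i, name in enumerate(names):
        for earlier in names[:i]:
            comparisons += 1
            if earlier == name:
                return True, comparisons
    return False, comparisons


def has_duplicate_hashed(names):
    """Same check using a set: one hash lookup per attribute.

    Returns (found_duplicate, lookups_made).
    """
    seen = set()
    lookups = 0
    for name in names:
        lookups += 1
        if name in seen:
            return True, lookups
        seen.add(name)
    return False, lookups


if __name__ == "__main__":
    for n in (10, 100, 1000):
        attrs = [f"a{i}" for i in range(n)]
        _, naive = has_duplicate_linear_scan(attrs)
        _, hashed = has_duplicate_hashed(attrs)
        print(f"n={n}: naive comparisons={naive}, hashed lookups={hashed}")
```

With 1000 distinct attributes the naive scan makes 499,500 comparisons against 1000 set lookups, which would fit the "unbearable" slowdown described above if the parser did work this way.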
Oct 25 '05 #2
Jürgen Kahrs wrote:
> Tom Kerigan wrote:
>> I know that longer element names increase the size of an XML document,
>> ultimately resulting in a larger amount of data at parse-time.

I'm not sure that that in itself would significantly affect the performance,
as reading bytes (which is all it's doing at that stage) is a relatively
low-level occupation. If the lexer is tokenising element type names and
storing them in some array-like data structure, big names will affect I/O
but not much else. But I'm happy to be proved wrong on that.
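
One way to check this is to measure it. The following is a rough Python sketch, not a Xerces benchmark: it uses the standard library's ElementTree parser, the helper names `build_document` and `time_parse` are made up for this example, and the element count and name length are arbitrary. It parses two documents that differ only in the length of the element type name and reports size and best-of-five parse time.

```python
import time
import xml.etree.ElementTree as ET


def build_document(name, count):
    """Return a flat XML document with `count` child elements called `name`."""
    children = "".join(f"<{name}>x</{name}>" for _ in range(count))
    return f"<root>{children}</root>"


def time_parse(xml_text, repeats=5):
    """Best-of-`repeats` wall-clock time to parse `xml_text`, in seconds."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        ET.fromstring(xml_text)
        best = min(best, time.perf_counter() - start)
    return best


if __name__ == "__main__":
    count = 5_000  # arbitrary; large enough for the timing to be visible
    for label, name in (("short", "e"), ("long", "e" * 450)):
        doc = build_document(name, count)
        print(f"{label}: {len(doc)} chars, {time_parse(doc):.4f} s")
```

The results only describe the parser being measured; to say anything about Xerces the same two documents would have to be fed to Xerces itself.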

>> Is there anything else, specifically related to an element name and its length,
>> that can impact the performance of an XML parser?

Depth can have an effect, especially in mixed content. I have relatively
small documents (4-5 MB) which are marked up very densely in TEI, with
deeply-nested structures such as variant readings of a manuscript or
linguistic (part-of-speech) markup in mixed content such that the character
data can be 15-20 levels below the root element. Nevertheless, onsgmls
rips through these in 5-8 seconds on a Dell 4150 running FC4/KDE/Emacs.
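
To see whether depth matters for a given parser, synthetic documents of controlled depth are easier to work with than real TEI. The Python sketch below is only an illustration: the element names `l0`, `l1`, ... and the helpers `build_nested` and `max_depth` are invented here, and it uses the standard library's expat binding rather than onsgmls or Xerces. It builds mixed content with character data a chosen number of levels below the root, then streams through it and reports the deepest nesting seen.

```python
import xml.parsers.expat


def build_nested(depth, text="word"):
    """Return mixed-content XML with character data `depth` levels below the root."""
    opening = "".join(f"<l{i}>{text} " for i in range(depth))
    closing = "".join(f" {text}</l{i}>" for i in reversed(range(depth)))
    return f"<root>{opening}{text}{closing}</root>"


def max_depth(xml_text):
    """Stream through `xml_text` and return the deepest element nesting seen.

    The root element counts as depth 1.
    """
    state = {"current": 0, "deepest": 0}

    def start(name, attrs):
        state["current"] += 1
        state["deepest"] = max(state["deepest"], state["current"])

    def end(name):
        state["current"] -= 1

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = start
    parser.EndElementHandler = end
    parser.Parse(xml_text, True)
    return state["deepest"]


if __name__ == "__main__":
    doc = build_nested(20)
    print(f"{len(doc)} chars, deepest element at level {max_depth(doc)}")
```

Repeating the generated fragment many times inside one root gives a file of realistic size whose depth can be varied independently of its length.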

I have seen some truly ludicrous examples of data-oriented e-commerce XML
with element type names machine-generated from concatenated
database-table-field-relation[-field-relation]*-value names which ran to
400-500 characters, but the files were very small (40-50 KB) so I'm not
sure what effect the names had on the parser (apart from the initial I/O).

> The number of attributes of an element may have
> an influence. I have seen a parser (I think it
> was xmllint) which seemed to have runtime O(n^2)
> where n=number of attributes. This became unbearable
> in some unlikely situations (more than 1000 attributes).


Number of attributes could probably affect it, but anyone who "designs"
a document type with elements bearing 1000 attributes deserves all they
get, IMHO.

///Peter
--
XML FAQ: http://xml.silmaril.ie/

Oct 25 '05 #3
